|
Intermediate Sybase and Transact - SQL September 29, 1999
· A Relational Database Management System RDBMS stores Tables within Databases, and provides rich possibilities for controlling who can access/change data, and for controlling the data being stored in the Database, through rules, constraints, triggers and other methods
· Access to the data in a Database is only possible with a valid Login and Password; a default Database is usually assigned by the administrator of the Database, or sa, so there may be no obvious prompting for which Database is being used
· A successful login establishes a connection over the network to the machine which contains the RDBMS; it’s possible that data is stored on multiple computers, using Sybase devices; there is no need to have user rights on the directory where the data is stored; that’s taken care of by Sybase
· RDBMS’es also store information about the Tables themselves, supporting queries against the structure of the Database, in addition to more standard queries against data-bearing Tables; shown below is one standard way of listing names of all Tables in your current Database
· The Structured Query Language (SQL) is the name of the command language of Sybase and all other RDBMS products, and the only way to communicate with an RDBMS, regardless of which ‘front end’, or Client, you are using in the Client-Server environment
· SQL SELECT commands permit reporting; other commands (INSERT, UPDATE, DELETE) modify Table contents
· SQL commands ( Queries ) require good knowledge of the names of Database objects, and understanding of the business purpose and relationships of Tables in the Database
· SQL SELECT queries generate a Result Set, which appears in Row and Column format to the Client application, which could be ISQL, SQL-Programmer, or any of several others, including Access, Actuate, Crystal Reports and Infomaker
· Normal data values in a Result Set represent a single entity, transaction, line item or other record
SELECT Syntax Template
· This is a template showing the range of possible syntaxes for a SELECT:
SELECT [ ALL | DISTINCT ] select_list [FROM table_name [ ALIAS ] [,...] ] [ WHERE search_conditions ] [ GROUP BY non_aggregate_expression ] [ HAVING group_level_condition ] [ ORDER BY { column_name | column_number [ {asc | desc } ]
WHERE: select_list means Columns, Column Functions, Literals, or any combination of these search_conditions can be AND’ed or OR’ed combinations non_aggregate_expression is also known as Scalar, (single-valued) group_level_condition is typically compares one of the aggregates in the select list of the GROUP’ed Result Set
NOTATION: The [ and ] characters delimit any parts of the syntax which are optional The { and } characters delimit any required syntax pieces, with the | character separating the valid choices
· There is an alternate form of SELECT, ( SELECT INTO ), which permits the creation of a new Table ‘on the fly’, which automatically inherits the structure (Columns) and any Rows which meet any WHERE clause of the SELECT Transact-SQL Functions
· ASCII( character_expression ) Returns ASCII code value of left-most (only) byte of character_expression
· CHAR( ASCII_code_integer ) Returns character corresponding to passed ASCII_code_integer
· CHARINDEX( search_string, expression_to_search ) Returns byte position of search_string within expression_to_search
· CEILING( numeric_expression ) Returns smallest integer greater than or equal to numeric_expression
· CONVERT( new_datatype , expression_to_convert ) Changes datatype of expression_to_convert to the datatype specified in new_datatype
· DATEADD( datepart , increment , expression_to_use ) Adds an interval of type ( datepart * increment ) to the date expression passed in the third argument
· DATEDIFF( datepart , earlier_date , more_recent_date ) Returns the amount of time, as measured in dateparts, between the two dates
· DATENAME( datepart , date ) Produces the specified datepart of the specified date as a string
· DATEPART( datepart, date )
· FLOOR(numeric_expression) Returns largest integer less than or equal to numeric_expression
· ISNULL( expression_to_test, substitution_value ) Tests expression for NULL, and if NULL found, substitutes value in second argument; substitue value must be of same datatype as the column
· POWER( numeric_expression , power ) Returns value of numeric_expression to the power of power
· RTRIM( character_expression ) and LTRIM(character_expression ) Trim trailing or leading spaces, respectively, from character_expression
· SOUNDEX( character_expression ) Returns a ‘sounds like’ code which of itself has no value other than to use in comparison with other characters tested with SOUNDEX()
· STUFF( character_expression, startbyte, length_to_delete, replacement_expression ) Within character_expression, deletes substring starting at startbyte, and replaces it with replacement_expression
· SUBSTRING( character_expression, startbyte, length_to_return ) Starting at character # startbyte, returns a substring of length_to_return characters
The LIKE Search Operator
· LIKE permits pattern matching, and substring searches
“Show me GL Codes which start with any two characters (hence two _ characters inside quotes ), then have two 1’s, then have one more character of any kind”
“Show me account names which start with Kelly or start with Scanlon, anywhere within the name; make sure that uppercased versions of the data do not throw off the search”
EXAMPLES:
“Show me account names which contain Kelly or Scanlon, anywhere within the name; this time, make it case-sensitive”
“Show me account names which have the words ‘of’ or ‘Of’ , anywhere within the name”
“Show me account names which have three uppercase letters in a row, anywhere within the name”
· Actual specification of the ‘%’ and ‘_’ symbols as search targets is possible through the use of [ ] to cause different treatment of those wildcard characters
“Show me account names which have either the word ‘per cent solution’ or ‘% solution’ anywhere within the name”
· LIKE can also be used with datetime data; an implicit conversion to varchar datatype is done before comparison
· Caution should be exercised in using LIKE
1. The criteria below could be done, and done better, by using standard ‘=‘ instead of LIKE in the WHERE clause; don’t confuse LIKE with exact match; LIKE is a less efficient operation:
2. Similarly, the LIKE wildcard characters have no special meaning without LIKE; the following WHERE clause would require an exact match, to include the % character as something which is expected to be stored in that Column
Other Search and Select Features
· Use SET ROWCOUNT to truncate very large Result Sets and get ‘samples’
· Indicate an apostrophe character with two apostrophes, as:
Aggregates - Review
· Aggregate Functions (AVG, SUM, COUNT, MIN, MAX) can return Rows which represent one or more ‘single-valued’, or Scalar Rows; there are only specific circumstances where Aggregate and Scalar data expressions can coexist:
1. In a nested query, where the ‘inner’ query could use an Aggregate, such as AVG, to generate a single value which is then used by the ‘outer’ query as part of its WHERE clause
2. In a GROUP BY clause, where the Scalar Column Expressions serve as ‘labels’ for the Aggregates
EXAMPLE of Aggregate used in ‘inner’ query
“Show me all ledgers entries for position dates in January - June 1998 which have a current balance greater than the average balance for that same period ”
Aggregates - Review
"Tell me how many PIC accounts we have (That's acprodgroup 37, Personal Investment Counsel)"
“Show me all ledgers entries for position dates in June 1998 which have a position date on the last Friday of June ”
EXAMPLE of Aggregate used in grouping
“Show me how many accounts per residence code (residence) ; ignore missing or undefined values in residence ”
Joins - Equi-Join
· Joins reflect, and work with, the normal, ‘atomic’ nature of Tables in an RDBMS, where many Tables may finally hold the information you need
· Joins permit the run-time construction of virtual Rows from two or more Tables, given some relationship between the Tables based on common data values
· Joins require a Foreign Key, potentially multi-column, whose data contents are necessary and sufficient to ‘reach out’ to related Tables to get data they stored for return in the Result Set
· If Column names are spelled the same in Tables you are joining, you must qualify Column name with Table name; a Table alias is usually utilized to minimize typing in such Column name qualification
· Because we will be introducing two new Join options, we must formally describe our original example of a Join in Day 1 to be an Equi-Join, as it was based on equality (an ‘exact match’ ) between Column values in two Tables; such a Join is often also called an ‘Inner Join’---see later discussion of Outer Joins
EXAMPLE of Equi-Join:
“Show me account information and broker information for all accounts who have brokers”
· Since a ‘many-to-many’ relationship exists between the entities in the accounts and bromast (Broker Master) Tables, the broact Table has been created in the Database, hence the alias ‘A to B’
· broact is an example of how relationships (‘relations’) can also be valid and valuable contents of a relational database
· As a general rule, for every n Tables involved in Joins, you need n-1 clauses as a minimum
· Joins and Nested Queries ( Subqueries ) can often accomplish the same effect; consider query below:
“Show me everything for those Rows in ledgers which have a residence code listed in our state lookup table”
· Nested Query ( Subquery )
· An equivalent Join:
· Remember, as in the above example, that Column names need not be spelled the same; the only requirement is that the Columns being used have similar data
· Your choice of whether to use a Join or a Nested Query is stylistic and is your preference
Joins - Self-Join
· A Self-Join is typically used for side-by-side comparison of Rows from the same Table; you are interested in comparing values in one or more Columns among ‘peer’ records
· Self-Joins rely upon using at least two Table aliases for the same Table, for the ‘left’ side and the ‘right’ side
· Because you will get twice as many ‘hits’ in your Join due to the fact that value A = value B and value B = value A, you must always pose an additional criterion in your WHERE clause which will weed out one of the pairings; the multipart addition to the WHERE clause cited below seems to be the only way to successfully remove the extra pairings
EXAMPLE of Self-Join:
“Let me see act_id, GL Account Code (gcaccount ) and balance ( current_bal ) for those Rows in ledgers which have the same whole dollar value”
Joins - Outer Join
· It is important to understand the Joins will naturally not display all Rows from joined Tables if there are no matching values in both Tables; Joins can easily leave out a lot of useful information
· An Outer Join permits you to reach out for Rows which do not have a matching member in the other Table in your WHERE clause; it can represent ‘unassigned’ values
· The Outer Join operator is a variation on the one used for Equi-Joins; its member Tables are often referred to as the Outer Table and the Inner Table
· The Outer Table in this usage is the Table which you expect has the ‘data overhang’, that is it has values in the join Column which may not have a matching value in the other, Inner Table
· The Outer Join operators are:
*= includes all rows from the Table on the left side of the operator that meet the statement’s restrictions, generating nulls in mismatched Columns in the Table specified on the right side of the operator -- The Table on the left is the Outer Table
=* includes all rows from the Table on the right side of the operator that meet the statement’s restrictions, generating nulls in mismatched Columns in the Table specified on the left side of the operator -- The Table on the right is the Outer Table
EXAMPLE of Outer Join:
“I need to see all residence ( residence ) information for accounts, as well as those accounts with no residence value”
(Notice nulls displaying as part of the Result Set; the server populates these ‘phantom’ Columns as part of the processing)
· The ISNULL() Function is good for Outer Joins, to provide a sensible substitution for the nulls which are generated
· One limitation of ISNULL() is the fact that the data values it generates are subject to the display length of the Column it is servicing; therefore the application of ISNULL shown below would not work, given the CHAR(2) datatype of residence
Nested Queries
· Because we will be introducing how to coordinate, or ‘Correlate’ a nested query with each Row in an outer query, we can now formally distinguish between Non-Correlated Subqueries, which were introduced in Day 1, and Correlated Subqueries, which will be introduced on the next page.
· Non-Correlated Subqueries are resolved ( calculated ) once during the query, and the data value(s) generated is/are used by the outer query for every Row processed by the outer query
EXAMPLE of Non-Correlated Subquery:
“Show me everything about accounts ( investment entities ) which have a residence code listed in our state lookup table”
“Show me everthing in ledgers for the most recent positiondate”
Correlated Subqueries
· A Subquery which relies upon at least one Column content in the outer query needs to have its WHERE clause coordinated, or ‘correlated’ with the outer query; this is largely done through syntax---a Table alias is used to indicate the outer query
· This alias is often referred to as a Correlation Name; it is assigned in the outer query and used in the inner query
" Show me ledgers entries which are the most recent entry for that act_id, gcaccount, curr combination"
· A Table alias may be used for the Table used in the inner query, but this is not required
EXAMPLE of Correlated Subquery in a GROUP BY query:
“Let me see the most recent date ( positiondate ) on which a ledgers entry had above average balance ( current_bal ) for that act_id, GL Account and currency ”
Temporary Tables
· Temporary Tables permit you to ‘freeze’ certain views of your data
· Tables automatically ‘scratched’ when you log off
· Often used within sprocs; Temporary Tables created within an sproc are scratched as soon as the sproc ends
· Very helpful for storing summaries for easier, faster joins
· Make it possible to create more sophisticated subqueries (e.g., using DATEPART, DATEDIFF in correlated subqueries )
· Are created with CREATE TABLE statement or SELECT INTO statement; permissions to run these statements may not have been granted by sa
· Are stored in a separate Database, named tempdb
EXAMPLE of creating Temporary Table from another Table; Column names automatically ‘inherited’ by Temporary Table
EXAMPLE of creating Temporary Table from another Table with Aggregates:
(Test the outcome by running the query SELECT * FROM #ledgers_rollup; notice the new Column names )
Sybase’s Transact-SQL extension to SQL
· You can use Transact-SQL (T-SQL), which is a ‘superset’ of industry-standard SQL, to do the following:
1. Create variables to hold results of queries --- see Batches
2. Use variables in queries --- see Batches
3. Do conditional processing ( IF ... ELSE )
4. Loop through a series of executable statements until a threshold condition is met by the processing
· Transact-SQL provides a procedural language for working with sets of data, and can be used by database administrators too
· Stored Procedures ( sprocs ) make good use of the Transact-SQL extensions cited above
· A special form of Stored Procedure is the Sybase Trigger, which automatically ‘fires’ on any INSERT, UPDATE or DELETE action on the associated Table; Triggers are optional, and must be created by a programmer
Transact-SQL Batches
· You can use a Transact-SQL Batch of queries to group multiple operations; by definition, a Batch is simply more than one query which you send to the Server
· Batches do not have names, and do not get stored as objects in the Database; their ‘source code’ can be saved to a text file and retrieved and reused, however
· You can create Variables to hold results of queries using the DECLARE statement of Transact-SQL; user-defined variables must start with a single @ character, and must have a datatype
· System Variables ( parts of the Sybase environment ) are available in a ‘readonly’ fashion; their names always begin with two @ characters
· Variables are given values by you through the SELECT statement, which can either be the result of a constant, a function, or a embedded SQL query
· Include descriptive comments as needed in the Batch with /* and */
· Within a Batch you can create a temporary Table to help break a complex query into pieces
· You can use the PRINT statement of Transact-SQL to display variable contents
· Use %1!, %2!, %3!, etc. as ‘placeholders’ within PRINT string for data expressions
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||