Intermediate

Sybase and Transact - SQL

 September 29,  1999

 
Review of SQL and the Relational Database Model

 

·        A Relational Database Management System RDBMS stores Tables within Databases, and provides rich possibilities for controlling who can access/change data, and for controlling the data being stored in the Database, through rules, constraints, triggers and other methods

 

 

·        Access to the data in a Database is only possible with a valid Login and Password; a default Database is usually assigned by the administrator of the Database, or sa, so there may be no obvious prompting for which Database is being used

 

 

·        A successful login establishes a connection over the network to the machine which contains the RDBMS; it’s possible that data is stored on multiple computers, using Sybase devices; there is no need to have user rights on the directory where the data is stored; that’s taken care of by Sybase

 

 

·        RDBMS’es also store information about the Tables themselves, supporting queries against the structure of the Database, in addition to more standard queries against data-bearing Tables; shown below is one standard way of listing names of all Tables in your current Database

 

SELECT

name

FROM

sysobjects

WHERE

type = ‘U’

ORDER BY

 name

 

 

·        The Structured Query Language (SQL) is the name of the command language of Sybase and all other RDBMS products, and the only way to communicate with an RDBMS, regardless of which ‘front end’, or Client, you are using in the Client-Server environment

 

 

·         SQL SELECT commands permit reporting;  other commands (INSERT, UPDATE, DELETE) modify Table contents

 

 

·        SQL commands ( Queries ) require good knowledge of the names of Database objects, and understanding of the business purpose and relationships of Tables in the Database

 

 

·        SQL SELECT queries generate a Result Set, which appears in Row and Column format to the Client application, which could be ISQL, SQL-Programmer, or any of several others, including Access, Actuate, Crystal Reports and Infomaker

 

 

·        Normal data values in a Result Set represent a single entity, transaction, line item or other record

 

SELECT Syntax Template

 

·        This is a template showing the range of possible syntaxes for a SELECT:

 

SELECT [ ALL | DISTINCT ]  select_list

     [FROM table_name [ ALIAS ] [,...]  ]

     [ WHERE search_conditions ]

     [ GROUP BY non_aggregate_expression ]

     [  HAVING group_level_condition ]

     [  ORDER BY { column_name | column_number [ {asc | desc } ]

 

 

WHERE:

select_list means Columns, Column Functions, Literals, or any combination of these

search_conditions can be AND’ed or OR’ed combinations

non_aggregate_expression is also known as Scalar, (single-valued)

group_level_condition is typically compares one of the aggregates in the select list of the GROUP’ed Result Set

 

NOTATION:

The [  and ] characters delimit any parts of the syntax which are optional

The { and } characters delimit any required syntax pieces, with the | character separating the valid choices

 

·        There is an alternate form of SELECT, ( SELECT INTO ), which permits the creation of a new Table ‘on the fly’, which automatically inherits the structure (Columns) and any Rows which meet any WHERE clause of the SELECT


Transact-SQL Functions

 

 

·        ASCII( character_expression )

Returns ASCII code value of left-most (only) byte of character_expression

 

·        CHAR( ASCII_code_integer )

Returns character corresponding to passed ASCII_code_integer

 

SELECT

ASCII( ‘A’ ) AS ‘ASCII()’,  CHAR( 65 ) AS ‘CHAR()’

 

·        CHARINDEX( search_string, expression_to_search )

Returns byte position of search_string within expression_to_search

 

SELECT

accountname,

CHARINDEX ( “ “, accountname ) AS ‘Using space’ ,

CHARINDEX ( “S“, accountname ) AS ‘Using S’

FROM

accounts

 

·        CEILING( numeric_expression )

    Returns smallest integer greater than or equal to numeric_expression

 

SELECT

CEILING ( 11.56 ), CEILING ( 0.41 ), CEILING ( .0004 )


 

 

·        CONVERT( new_datatype ,  expression_to_convert )

Changes datatype of expression_to_convert to the datatype specified in new_datatype

 

SELECT

SUM( effamt ) AS “Default SUM()” ,

CONVERT( MONEY, SUM( effamt )) AS “Formatted SUM()”

FROM

 lot_shares

 

·        DATEADD( datepart , increment , expression_to_use  )

   Adds an interval of type ( datepart * increment ) to the date expression passed in the third argument

 

 

·        DATEDIFF( datepart , earlier_date , more_recent_date )

Returns the amount of time, as measured in dateparts, between the two dates

 

 

·        DATENAME( datepart , date )

Produces the specified datepart of the specified date as a string

 

SELECT

DATENAME( WEEKDAY, GETDATE() ) , DATENAME( DAY, GETDATE(),

DATENAME( MONTH, GETDATE() ), DATENAME ( YEAR, GETDATE() )

 

·        DATEPART( datepart, date )


 

 

·        FLOOR(numeric_expression)

    Returns largest integer less than or equal to numeric_expression

 

SELECT

FLOOR ( 11.56 ), FLOOR( 0.41 ), FLOOR( .0004 )

 

 

·        ISNULL( expression_to_test, substitution_value )

Tests expression for NULL, and if NULL found, substitutes value in second argument; substitue value must be of same datatype as the column

 

SELECT

 accountname, ISNULL( residence, ‘!!!’ )

FROM

 accounts

 

 

·        POWER( numeric_expression , power )

Returns value of numeric_expression to the power of power

 

 

·        RTRIM( character_expression ) and LTRIM(character_expression )

     Trim trailing or leading spaces, respectively, from character_expression

 

SELECT

RTRIM( lastname ) + ‘, ‘ + RTRIM( firstname )

FROM

 personnel


 

 

·        SOUNDEX( character_expression )

Returns a ‘sounds like’ code which of itself has no value other than to use in comparison with other characters tested with SOUNDEX()

 

SELECT

SOUNDEX( ‘THREW’ ), SOUNDEX( ‘THROUGH’), SOUNDEX( ‘THROW’), SOUNDEX( ‘THOROUGH’)

 

 

·        STUFF(  character_expression, startbyte, length_to_delete, replacement_expression )

Within character_expression, deletes substring starting at startbyte, and replaces it with replacement_expression

 

 

·        SUBSTRING( character_expression, startbyte, length_to_return )

Starting at character # startbyte, returns a substring of length_to_return characters

 

The LIKE Search Operator

 

 

·        LIKE  permits pattern matching, and substring searches

 

“Show me GL Codes which start with any two characters (hence two _ characters inside quotes ), then have two 1’s, then have one more character of any kind”

 

SELECT

 DISTINCT gcaccount

FROM

 ledgers

WHERE

 gcaccount LIKE ‘__11_’

ORDER BY

 gcaccount

 

 

“Show me account names which start with Kelly or start with Scanlon, anywhere within the name; make sure that uppercased versions of the data do not throw off the search”

 

SELECT

 accountname

FROM

 accounts

WHERE

       UPPER ( accountname ) LIKE ‘KELLY%’

OR UPPER ( accountname ) LIKE ‘SCANLON%’

ORDER BY

 accountname

 

 

EXAMPLES:

 

 

“Show me account names which contain Kelly or Scanlon, anywhere within the name; this time, make it case-sensitive”

 

SELECT

 accountname

FROM

 accounts

WHERE

       accountname  LIKE ‘%Kelly%’

OR accountname  LIKE ‘%Scanlon%’

ORDER BY

 accountname

 

 

“Show me account names which have the words ‘of’ or ‘Of’ , anywhere within the name”

 

SELECT

 accountname

FROM

 accounts

WHERE

 accountname LIKE ‘%[Oo]f%’

ORDER BY

 accountname

 

 

“Show me account names which have three uppercase letters in a row, anywhere within the name”

 

SELECT

 accountname

FROM

 accounts

WHERE

 accountname LIKE ‘%[A-Z][A-Z][A-Z]%’

ORDER BY

 accountname

·        Actual specification of the ‘%’ and ‘_’ symbols as search targets is possible through the use of [ ] to cause different treatment of those wildcard characters

 

“Show me account names which have either the word ‘per cent solution’ or  ‘% solution’ anywhere within the name”

 

WHERE

       UPPER( accountname ) LIKE ‘[%] SOLUTION’

OR UPPER ( accountname ) LIKE ‘PER CENT SOLUTION’

 

 

·        LIKE can also be used with datetime data; an implicit conversion to varchar datatype is done before comparison

 

 

·        Caution should be exercised in using LIKE

 

1.     The criteria below could be done, and done better, by using standard ‘=‘ instead of LIKE in the WHERE clause; don’t confuse LIKE with exact match; LIKE is a less efficient operation:

 

WHERE

       UPPER ( accountname ) LIKE ‘KELLY

OR UPPER ( accountname ) LIKE ‘SCANLON

 

2.     Similarly, the LIKE wildcard characters have no special meaning without LIKE; the following WHERE clause would require an exact match, to include the % character as something which is expected to be stored in that Column

 

WHERE

       UPPER ( accountname ) = ‘KELLY%’

OR UPPER ( accountname ) = ‘SCANLON%’

 

Other Search and Select Features

 

 

·        Use SET ROWCOUNT to truncate very large Result Sets and get ‘samples’

 

 

SET ROWCOUNT 100

 

SELECT

*

FROM

price

SET ROWCOUNT 0

 

 

  

·        Indicate an apostrophe character with two apostrophes, as:

 

 

WHERE

lastname = ‘O’’Toole’

 

 

Aggregates - Review

 

 

·        Aggregate Functions (AVG, SUM, COUNT, MIN, MAX) can return Rows which represent one or more ‘single-valued’, or  Scalar Rows;  there are only specific circumstances where Aggregate and Scalar data expressions can coexist:

 

1.     In a nested query, where the ‘inner’ query could use an Aggregate, such as AVG, to generate a single value which is then used by the ‘outer’ query as part of its WHERE clause

 

2.     In a GROUP BY clause, where the Scalar Column Expressions serve as ‘labels’ for the Aggregates

 

 

EXAMPLE of Aggregate used in ‘inner’ query

 

“Show me all  ledgers entries for position dates in January - June 1998 which have a current balance greater than the average balance for that same period ”

 

 

SELECT

 *

FROM

 ledgers

WHERE

         current_bal   >

 

 

AND  positiondate

( SELECT AVG( current_bal )   FROM ledgers

  WHERE positiondate BETWEEN ‘1/1/1998’ AND ‘6/30/1998’  )

 

BETWEEN ‘1/1/1998’ AND ‘6/30/1998’

ORDER BY

 act_id,  gcaccount, curr


Aggregates - Review

 

"Tell me how many PIC accounts we have (That's acprodgroup 37, Personal Investment Counsel)"

 

SELECT

COUNT( *) AS "PIC Accounts on file"

FROM

accounts

WHERE

acprodgrp = 37

 

 

 

 

“Show me all  ledgers entries for position dates in June 1998 which have a position date on the last Friday of June ”

 

SELECT

 *

FROM

 ledgers

WHERE

 positiondate =

 

 

( SELECT MAX( positiondate )   FROM ledgers

  WHERE DATEPART( month , positiondate ) = 6

  AND DATEPART( weekday, positiondate ) = 6 )

ORDER BY

 act_id,  gcaccount, curr

 

 

EXAMPLE of Aggregate used in grouping

 

“Show me how many accounts per residence code (residence) ; ignore missing or undefined values in residence

 

SELECT

 residence , COUNT(*)

FROM

 accounts

WHERE  

 residence IS NOT NULL

 

GROUP BY

 residence

 


Joins - Equi-Join

 

 

·        Joins reflect, and work with, the normal, ‘atomic’ nature of Tables in an RDBMS, where many Tables may finally hold the information you need

 

 

·        Joins permit the run-time construction of virtual Rows from two or more Tables, given some relationship between the Tables based on common data values

 

 

·        Joins require a Foreign Key, potentially multi-column, whose data contents are necessary and sufficient to ‘reach out’ to related Tables to get data they stored for return in the Result Set

 

·        If Column names are spelled the same in Tables you are joining, you must qualify Column name with Table name; a Table alias is usually utilized to minimize typing in such Column name qualification

 

·        Because we will be introducing two new Join options, we must formally describe our original example of a Join in Day 1 to be an Equi-Join, as it was based on equality (an ‘exact match’ ) between Column values in two Tables; such a Join is often also called an ‘Inner Join’---see later discussion of Outer Joins

 

EXAMPLE of Equi-Join:

 

“Show me account information and broker information for all accounts who have brokers”

 

SELECT 

A.act_id, acnomajor AS “Account #”, acnominor AS “Portfolio #”,

accountname, brtype, brname1

FROM

 accounts A, bromast B, broact AtoB

WHERE

          A.act_id =  AtoB.act_id

AND  B.brbroker = AtoB. brbroker

AND  B.bractive =  ‘A’

 

 

 

·        Since a ‘many-to-many’ relationship exists between the entities in the accounts and bromast (Broker Master) Tables, the broact Table has been created in the Database, hence the alias ‘A to B’

 

·        broact is an example of how relationships (‘relations’) can also be valid and valuable contents of a relational database

 

 

·        As a general rule, for every n  Tables involved in Joins, you need n-1 clauses as a minimum

 

 

·        Joins and Nested Queries ( Subqueries ) can often accomplish the same effect; consider query below:

 

“Show me everything for those Rows in  ledgers which have a residence code listed in our state lookup table”

 

 

·        Nested Query ( Subquery )

 

SELECT

 *

FROM

 accounts

WHERE

 residence IN 

( SELECT statecode

  FROM state )

ORDER BY

 residence

 

 

·        An equivalent Join:

 

SELECT

 *

FROM

 accounts , state

WHERE

 residence = statecode

 

  

·        Remember, as in the above example, that Column names need not be spelled the same; the only requirement is that the Columns being used have similar data

 

·        Your choice of whether to use a Join or a Nested Query is stylistic and is your preference

 

Joins - Self-Join

 

 

·        A Self-Join is typically used for side-by-side comparison of Rows from the same Table; you are interested in comparing values in one or more Columns among ‘peer’ records

 

 

·        Self-Joins rely upon using at least two Table aliases for the same Table, for the ‘left’ side and the ‘right’ side

 

 

·        Because you will get twice as many ‘hits’ in your Join due to the fact that value A = value B and value B = value A, you must always pose an additional criterion in your WHERE clause which will weed out one of the pairings; the multipart addition to the WHERE clause cited below seems to be the only way to successfully remove the extra pairings

 

 

EXAMPLE of Self-Join:

 

“Let me see act_id, GL Account Code (gcaccount ) and balance ( current_bal ) for those Rows in ledgers which have  the same whole dollar value”

 

SELECT

 L.act_id, L.gcaccount, L.positiondate, CONVERT( INTEGER , L.current_bal ) ,

 R.act_id, R.gcaccount, R.positiondate

FROM

 ledgers L , ledgers R

WHERE

          CONVERT( INTEGER , L.current_bal ) = CONVERT( INTEGER , R.current_bal )

AND ( L.act_id > R.act_id OR L.gcaccount > R.gcaccount  OR L.positiondate > R.positiondate )

ORDER BY

L.act_id

 

Joins - Outer Join

 

 

·        It is important to understand the Joins will naturally not display all Rows from joined Tables if there are no matching values in both Tables; Joins can easily leave out a lot of useful information

 

 

·        An Outer Join permits you to reach out for Rows which do not have a matching member in the other Table in your WHERE clause; it can represent ‘unassigned’ values

 

 

·        The Outer Join operator is a variation on the one used for Equi-Joins; its member Tables are often referred to as the Outer Table and the Inner Table

 

 

·        The Outer Table in this usage is the Table which you expect has the ‘data overhang’, that is it has values in the join Column which may not have a matching value in the other, Inner Table

 

 

·        The Outer Join operators are:

 

*= includes all rows from the Table on the left side of the operator that meet the statement’s restrictions, generating nulls in mismatched Columns in the Table specified on the right side of the operator -- The Table on the left is the Outer Table

 

=* includes all rows from the Table on the right side of the operator that meet the statement’s restrictions, generating nulls in mismatched Columns in the Table specified on the left side of the operator -- The Table on the right is the Outer Table

 

 

 EXAMPLE of Outer Join:

 

“I need to see all residence ( residence ) information for accounts, as well as those accounts with no residence value”

 

SELECT

 act_id, accountname, residence, statecode, statename

FROM

 accounts , state

WHERE

 residence =* statecode

 

(Notice nulls displaying as part of the Result Set; the server populates these ‘phantom’ Columns as part of the processing)

 

 

·        The ISNULL() Function is good for Outer Joins, to provide a sensible substitution for the nulls which are generated

 

SELECT

 act_id, accountname, residence, statecode,  ISNULL( statename , “N/A” ) AS “Residence”

FROM

 accounts  ,  state

WHERE

 residence =* statecode

 

 

·        One limitation of ISNULL() is the fact that the data values it generates are subject to the display length of the Column it is servicing; therefore the application of ISNULL shown below would not work, given the CHAR(2) datatype of residence

 

SELECT

 act_id, accountname, ISNULL( residence, “None Listed” )

FROM

 accounts

 

 

Nested Queries

 

 

·        Because we will be introducing how to coordinate, or ‘Correlate’ a nested query with each Row in an outer query, we can now formally distinguish between Non-Correlated Subqueries, which were introduced in Day 1, and Correlated Subqueries, which will be introduced on the next page.

 

 

·        Non-Correlated Subqueries are resolved ( calculated ) once during the query, and the data value(s) generated is/are used by the outer query for every Row processed by the outer query

 

EXAMPLE of Non-Correlated Subquery:

 

“Show me everything about accounts ( investment entities ) which have a residence code listed in our state lookup table”

 

SELECT

 *

FROM

 accounts

WHERE

 residence IN 

( SELECT statecode

  FROM state )

ORDER BY

 residence

 

“Show me everthing in ledgers for the most recent positiondate”

 

SELECT

 *

FROM

 ledgers

WHERE

 positiondate = ( SELECT MAX(  positiondate )

                            FROM ledgers )

ORDER BY

 act_id, gcaccount

 

Correlated Subqueries

 

 

·        A Subquery which relies upon at least one Column content in the outer query needs to have its WHERE clause coordinated, or ‘correlated’ with the outer query; this is largely done through syntax---a Table alias is used to indicate the outer query

 

·        This alias is often referred to as a Correlation Name; it is assigned in the outer query and used in the inner query

 

 

" Show me ledgers  entries which are the most recent entry for that act_id, gcaccount, curr combination"

 

SELECT

 *

FROM

 ledgers OUTERROW

WHERE

 positiondate =   ( SELECT MAX( positiondate )

                              FROM ledgers

                              WHERE ledgers.act_id = OUTERROW.act_id

                              AND  ledgers.gcaccount = OUTERROW.gcaccount

                             AND ledgers.curr = OUTERROW.curr )

ORDER BY

 act_id,  gcaccount,  curr

 

·        A Table alias may be used for the Table used in the inner query, but this is not required

  

EXAMPLE of Correlated Subquery in a GROUP BY query:

 

“Let me see the most recent date ( positiondate ) on which a ledgers entry had above average balance ( current_bal ) for that act_id, GL Account and currency ”

 

SELECT

 act_id AS “ID”,  

 gcaccount AS “GL Code”, 

 curr AS “Currency”,

 MAX( positiondate )  AS “Most Recent”

FROM

 ledgers OUTERROW

WHERE

 current_bal >  ( SELECT AVG( current_bal )

                              FROM ledgers

                              WHERE ledgers.act_id = OUTERROW.act_id

                              AND  ledgers.gcaccount = OUTERROW.gcaccount

                             AND ledgers.curr = OUTERROW.curr )

GROUP BY

 act_id,  gcaccount,  curr

 

Temporary Tables

 

·        Temporary Tables permit you to ‘freeze’ certain views of your data

 

·        Tables automatically ‘scratched’ when you log off

 

·        Often used within sprocs; Temporary Tables created within an sproc are scratched as soon as the sproc ends

 

·        Very helpful for storing summaries for easier, faster joins

 

·        Make it possible to create more sophisticated subqueries (e.g., using DATEPART, DATEDIFF in correlated subqueries )

 

·        Are created with CREATE TABLE statement or SELECT INTO statement; permissions to run these statements may not have been granted by sa

 

·        Are stored in a separate Database, named tempdb

 

 

EXAMPLE of creating Temporary Table from another Table; Column names automatically ‘inherited’ by Temporary Table

 

 

SELECT

 *

INTO

 #newyork

FROM

 accounts

WHERE

 residence = ‘NY’

 

 

EXAMPLE of creating Temporary Table from another Table with Aggregates:

 

 

SELECT

 act_id AS “ID”,  

 gcaccount AS “GL_Code”, 

 curr AS “Currency”,

 DATEPART( MONTH, MAX( positiondate ) ) AS “MaxMonth”

INTO

 #ledgers_rollup

FROM

 ledgers

WHERE

 current_bal >  ( SELECT AVG( current_bal )

                              FROM ledgers

                              WHERE ledgers.act_id = OUTERROW.act_id

                              AND  ledgers.gcaccount = OUTERROW.gcaccount

                             AND ledgers.curr = OUTERROW.curr )

GROUP BY

 act_id,  gcaccount,  curr

 

(Test the outcome by running the query SELECT * FROM #ledgers_rollup; notice the new Column names )

 

 

Sybase’s Transact-SQL extension to SQL

 

·        You can use Transact-SQL (T-SQL), which is a ‘superset’ of industry-standard SQL, to do the following:

 

1.     Create variables to hold results of queries   --- see Batches

 

2.     Use variables in queries   --- see Batches

 

3.     Do conditional processing ( IF ... ELSE )

 

4.     Loop through a series of executable statements until a threshold condition is met by the processing

  

·        Transact-SQL provides a procedural language for working with sets of data, and can be used by database administrators too

 

 

·        Stored Procedures ( sprocs ) make good use of the Transact-SQL extensions cited above

 

 

·        A special form of Stored Procedure is the Sybase Trigger, which automatically ‘fires’ on any INSERT, UPDATE or DELETE action on the associated Table; Triggers are optional, and must be created by a programmer

 

 

Transact-SQL Batches

 

 

·        You can use a Transact-SQL Batch of queries to group multiple operations; by definition, a Batch is simply more than one query which you send to the Server

 

 

·        Batches do not have names, and do not get stored as objects in the Database; their ‘source code’ can be saved to a text file and retrieved and reused, however

 

 

·        You can create Variables to hold results of queries using the DECLARE statement of Transact-SQL; user-defined variables must start with a single @ character, and must have a datatype

 

DECLARE @RowCount FLOAT ,  @Today DATETIME , @Ctr INTEGER

 

 

·        System Variables ( parts of the Sybase environment ) are available in a ‘readonly’ fashion; their names always begin with two @ characters

 

 

·        Variables are given values by you through the SELECT statement, which can either be the result of a constant, a function, or a embedded SQL query

 

SELECT @Today  =

GETDATE()

SELECT @Ctr  =

1

SELECT @Ctr  =

@Ctr + 1

 

·        Include descriptive comments as needed in the Batch with  /*  and  */

 

 

·        Within a Batch you can create a temporary Table to help break a complex query into pieces

 

 

·         You can use the PRINT statement of Transact-SQL to display variable contents

 

 

·        Use %1!, %2!, %3!, etc. as ‘placeholders’ within PRINT string for data expressions

 

DECLARE @RowCount FLOAT ,    @Today DATETIME

SELECT @Today  = GETDATE()

 

 

/* The 2 SELECTs could have been done in one query */

 

SELECT  @RowCount  = COUNT(*)  

 

FROM price 

 

PRINT “There are %1! Rows in the Price Table” , @RowCount