Studying for the 70-457: Part Deux

January 31 is coming up pretty fast ,Just 3 more days, so I want to try to knock out the rest of the study guide in the next couple days so that I’m ready Thursday! Let’s jump right in!

Work with Data

  • Query data by using SELECT statements.
    • This objective may include but is not limited to: use the ranking function to select top(X) rows for multiple categories in a single query; write and perform queries efficiently using the new code items such as synonyms and joins (except, intersect); implement logic which uses dynamic SQL and system metadata; write efficient, technically complex SQL queries, including all types of joins versus the use of derived tables; determine what code may or may not execute based on the tables provided; given a table with constraints, determine which statement set would load a table; use and understand different data access technologies; CASE versus ISNULL versus COALESCE

You want to know what’s funny I seriously thought the left join with the default join. I had to actually look it up and MSDN to find out the inner joint is the default join.  I’ve become so used to always typing the join type, LEFT, INNER, RIGHT, CROSS, that when I just see JOIN, I think it’s a LEFT join. Dynamic sql and performance issues — You need to keep in mind if you’re going to use dynamic SQL that in doing so, you suffer a performance hit.  What I mean is you can’t reuse execution plans when you’re creating dynamic SQL.  Each time you run the dynamic SQL, the engine is going to try and create an execution plan for that call.  You can get around that by changing your server setting to “optimize for adhoc workloads”.  This will improve your performance if you do a lot of adhoc requests. My personal preference would be to understand why you’re using so much dynamic SQL and try to address the needs that caused the dynamic queries to be used.  But if you simply can’t address the root cause… the optimize option is there. DMVs — starting with SQL 2005 you got a bunch of dynamic management views and functions you could use to figure out what’s going on inside the server you could find out things like wait types and  how long you’ve been waiting.  You could find your worst performing queries by run time, or execution plans being used.  Using DMVs is what get’s me around having to break out profiler every time someone needs me to figure out why the server is slow. Correlated Subqueries, I’ve been using IN and  EXISTS  with subqueries for years.  It’s an easy way to figure out what’s missing from a load.  But with 2012, you can do something new.  You can compare a single value to a set. Let’s say you wanted to see all the salespeople that sold less than the top 10 sales people ( I know this is a contrived example, but work with me here.)

SELECT FirstName, LastName
FROM SalesPeople
WHERE SalesTotal < ALL (SELECT SalesTotal FROM v_TopTenSalesPeople)

The ALL keyword requires that the row’s value has to be < all the values in the view v_TopTenSalesPeople. The ANY or SOME keyword would require that the Sales total be less than one of those values.  So you’d get back the #2 through #10 sales person in your results, but not the Top sales person.  With the ANY or SOME, you could construct statements that would give you the same results as an EXISTS statement. CASE VS ISNULL VS COALESCE — check out this article over on stackexchange.

  • Implement sub-queries.
    • This objective may include but is not limited to: identify problematic elements in query plans; pivot and unpivot; apply operator; cte statement; with statement

PIVOT/UNPIVOT  –for now  check out the MSDN article.

  • Implement data types.
    • This objective may include but is not limited to: use appropriate data; understand the uses and limitations of each data type; impact of GUID (newid, newsequentialid) on database performance, when to use which data type for columns

I’ve seen it a lot: using a GUID column as the primary key (and by default a clustered index).  NEWID() doesn’t provide a sequential value.  It’s random.  So you’re going to get a ton of page splits on your insert operations.  That’s where people will come to the idea let’s use NEWSEQUENTIALID().  It is sequential, but it’s still not a great primary key.  When I need a globally unique identifier on a row, it’s cool to make that column a primary key, just find out what your natural sort order is for that data.

That way the normal operation of inserting new data will always append your new rows to the last page (or make a new page at the end), rather than suffering the performance hit of having to split a page, then move some data around, then inserting your data.  Trust me, when you start dealing with heavy insert loads, you’ll thank me.

 

I started this study guide Monday, and now it’s Wednesday…looks like I’m not going to finish the study guide before the exam.  Good thing is I have more exams scheduled later in the year.  I have time to create plenty more study guides. If you’re looking for something in particular, let me know.  I’ll help you find it!

 

Leave a Reply

Your email address will not be published. Required fields are marked *