开发者

SQL Server minus MySQL Features

I know MySQL and and trying to learn SQL Server . I am looking for features/k开发者_如何学编程eywords that are in SQL Server and not in MySQL. Eg: TOP, CLUSTERED/NONCLUSTERED Indexes etc..

Any links/pointers are appreciated.

Thanks!


I hope you don't mind me pasting from an answer I gave to someone else. The question was about performance in general, but in covering all aspects of performance I touched on most of the engine features as well. Giving yourself a thorough eduction in performance is going to also give you a thorough education in features.

So here are basic performance items to research. This is slanted toward MSSQL but not exclusive to it:

  • The basic logical data storage architecture of the system you're working with. For example, b-tree, extent, page, the sizes and configurations of these, how much data is read at once, the maximum size of a row (if that is an issue in your DBMS), what is done with out-of-row data (again if that is an issue in your DBMS).

  • Indexes, constraints, and basic ordering of tables and row data: heaps, clustered, nonclustered, unique and non-uniqueness of these indexes, primary keys, unique constraints, included columns. In all these indexes whether nulls are allowed, just one null is allowed, or none. Uniqueifiers. Covering index.

  • SARGability (look up SARG which is short for "Search ARGument").

  • Foreign keys, defaults, cascade deletes/update, their effect on inserts and deletes.

  • Whether NULLs require any storage space and if this is affected by column position. The number of bytes required to store each data type. When trailing spaces are stored or not stored for string data types. Packed vs. non-packed data types (e.g. float and decimal vs. integer). The concept of rows per page (or smallest unit of disk read) in both clustered and nonclustered indexes.

  • Fill factor, fragmentation, statistics, index selectivity, page splits, forwarding pointers.

  • When "batching" an operation can boost performance and why, and how to do it most efficiently.

  • INNER, LEFT, RIGHT, FULL, and CROSS JOINs. Semi-joins (EXISTS) and anti-semi-joins (NOT EXISTS). Any other language-specific syntax such as USING in mySql and CROSS APPLY/OUTER APPLY in SQL Server. The effect of putting a join condition in the ON clause of an outer join vs. putting it in the WHERE clause.

  • Independent subqueries, correlated subqueries, derived tables, common table expressions, understanding that EXISTS and NOT EXISTS generally appear to introduce a correlated subquery, but usually are seen in the execution plan as joins (semi or anti-semi joins).

  • Viewing and understanding execution plans either graphically or in text. Viewing the statistics/profile of CPU, reads, writes, and duration used by whole SQL batches or individual statements. Understanding the limitations of execution plans & profiles, which practically speaking means you generally have to use both to optimize well. Caching and reuse of execution plans, expiration of plans from the cache. Parameter sniffing and parameterization. Dynamic SQL in relation to these.

  • The relative costs of converting data types to other data types or just working with those data types. (For example, a solid rule of thumb is that working with strings is more costly than working with numbers.)

  • The generally exorbitant cost of row-by-row processing as opposed to set-based. The proper use for cursors (rare, though sometimes called for). How functions can hide execution plan costs. The tempting trap of writing functions that get called for every row when the problem could be solved in sets (though this can be tricky to learn how to see, especially because traditional application programming tends to train people to think in terms of functions like this).

  • Seeks, scans, range scans, "skip" scans. Bookmark lookups aka an index seek followed by table seek to the same table using the value found in the index seek. Loop, merge, and hash joins. Eager & lazy spools. Join order. Estimated row count. Actual row count.

  • When a query is too big and should be split into more than one, using temp tables or other means.

  • Multi-processor capabilities and the benefits and gotchas of parallel execution.

  • Tempdb or other temp file usage. Lifetime and scope of temp tables, table variables (if your DB engine has such). Whether statistics are collected for these (in SQL Server temp tables use statistics and table variables do not).

  • Locking, lock granularity, lock types, lock escalation, blocks, deadlocks. Data access pattern (such as UPDATE first, INSERT second, DELETE last). Intent, shared, exclusive locks. Lock hints (e.g. in SQL Server UPDLOCK, HOLDLOCK, READPAST, TABLOCKX).

  • Transactions and transaction isolation. Read committed, read uncommitted, repeatable read, serializable, snapshot, others I can't remember now.

  • Data files, file groups, separate disks, transaction logs, simple recovery, full recovery, oldest open transaction aka minimum log sequence number (LSN), file growth.

  • Sequences, arrays, lists, identity columns, windowing functions, TOP/rownum/limiting number of rows returned.

  • Materialized views aka indexed views. Calculated columns.

  • 1 to 1, 1 to 0 or 1, 1 to many, many to many.

  • UNION, UNION ALL, and other "vertical" joins. SQL Server has EXCEPT and INTERSECT, too.

  • Expansion of IN () lists to OR. Expansion of IsNull(), Coalesce(), or other null-handling mechanisms to CASE statements.

  • The pitfalls of using DISTINCT to "fix" a query instead of dealing with the underlying problem.

  • How linked servers do NOT do joins across the link well, queries to a linked server often become row-by-row, large amounts of data can be pulled across the link to perform a join locally even if this isn't sensible.

  • The pitfall of doing any I/O or error-prone task in a trigger. The scope of triggers (whether they fire for every row or once for each data operation).

  • Making the front-end, GUI, reporting tool, or other client do client-type work (such as formatting dates or numbers as strings) instead of the DB engine.

  • Error handling. Rolling back transactions and how this always rolls back to the first transaction no matter how deeply nested, but a COMMIT only commits one level of work.


It is not a good way to learn new DBMS. The difference is not in the sql dialect only, but in the how DBMS works, its behaviour, practices, etc. Each new DBMS should be learned by you as yours first one.


NewID() is one different keyword too but I think keywords are not the only thing you need to look for. Stored procedures definition also have differences. Security is different (dbo user, use of windows groups/users in MSSQL), networking and I agree with zerkms answer too that behavior and practices are different.


delete from tableA where columnA in(
select columnA from (
select a.*, row_number() over (order by columnA) rn
from tableA a)
where rn>100)
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜