sexta-feira, 22 de janeiro de 2010

Transparent Data Encryption on SQL 2008 (TDE)

Actually over the years sql server has improved a lot in terms of security. In the beginning, a long time ago, the industry didn´t have much concerns about security, even because this kind of implementation requires cost of development, implies in a more rigid development policies (to develop minimizing the surface area of attack), and the types of attack were not as severe as they are today (and today they imply in loosing money).

Over the last two versions of SQL Server (2005 and 2005) Microsoft made a big effor in making the product more secure. The reasons are simple. If they want SQL Server grow in more mission critical applications, the system must be protected from attacks, and more critical the application and the data, SQL Server must be more secure in terms of data access.

That´s why the term “secure by default” became a clichè in many Microsoft presentations regarding SQL Server.

One of the features on SQL 2008 related to security that at the same time is simple to implement and important for highly confidential database is Transparent Data Encryption. It covers two different aspects.

One, if you manage to detach a SQL Server database you might virtually attach it back in any other SQL Server instance. In this case someone with access to the system can steal the information and bypass security and auditing by simply creating a copy of the database. Which is a problem in terms of confidentiality.

Other is less common but the data inside the database could be exposed by some disk editing tool, which can check the database file internally, once the database file has no level of encryption it could be a risk, someone sniffing your disk searching for some information.

Transparent Data Encryption addressses these problems by creating all data inside the database encrypted. The data inside the database file is coded, and at the same time you cannot attach the database on other server, without having installed the key and certificatesd to decrypt the database

The process consists in creating a Master key, and a certificate. Later using this certificate to encrypt the database.  The master key, should have a password which must be strong in order to be more difficult to break by brute force, for example:

 

/*
Script enable Transparent Data Encryption on AdventureWorks database
*/
USE master;
GO
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'Fum0V0rt3m0@$ql';
go
CREATE CERTIFICATE CertDecriptacao_advwrks WITH SUBJECT = 'Certificvado de decriptação para AdventureWorks'
go
USE AdventureWorks
GO
CREATE DATABASE ENCRYPTION KEY
WITH ALGORITHM = AES_128
ENCRYPTION BY SERVER CERTIFICATE CertDecriptacao_advwrks
GO
ALTER DATABASE AdventureWorks
SET ENCRYPTION ON
GO

A good way to work with it is to make some tests in an isolated environment first and them pass it to production in specific databases. It is pretty clear that the encryption process impacts on the overall performance by creating some cpu overhead.

So … have fun!!

quarta-feira, 13 de janeiro de 2010

4NF 5NF and 6NF

Follows a post i wrote on Brazilian Technet, later i will translate it to english.

 

4FN Quarta forma normal
“Uma tabela está na 4FN, se e somente se, estiver na 3FN e não existirem dependências multivaloradas”
Pode-se ver como a transformação de algo assim
Supondo que cada paciente pode ter mais de um plano de saúde, e pode fazer vários exames
Paciente   Plano      Exame
Carlos      SuaVida       Hemograma
Carlos      SL               Hemograma
Carlos      SuaVida       Urina
Carlos      SL               Hemograma

A idéia da 4FN seria dividir a informação em duas tabelas distintas
Uma com a Pedido do exame
e outra com a infromação de quais os planos q o paciente suporta, isso evitaria a redundência dessa informação em cada solicitação.
Paciente    Exame
Carlos       Hemograma
Carlos       Urina
Paciente PLano
Carlos    SuaVIda
Carlos    SL

A quinta forma é chamada de Forma de Projeção e União (Project/Join Normal Form)
Considere o exemplo
Vendedor   Marca       Tipo de Produto
Carlos        MArba      Aspirador de pó
Carlos        MArba      Torradeira
João          Tabajara   Geladeira
João          Tabajara   Torradeira
Arthur        Capivara   Aspirador de Pó
Arthur        Capivara   Tanquinho
Arthur        Capivara   Geladeira

Agora, suponha a seguinte regra: um vendedor tem um determinado conjunto de marcas e certo tipo de produtos na sua carteira de vendas.
Então poderiamos ter a tabela dividida da seguinte forma
Vendedor   Tipo de Produto
Carlos        Aspirador de pó
Carlos        Torradeira
João           Geladeira
João           Aspirador de Pó
Arthur        Tanquinho
Arthur        Geladeira

Numa outra tabela as marcas q o vendedor vende

Vendedor    Marca
Carlos        Marba
Joao           Tabajara
Arthur         Capivara

E logicamente numa outra tabela a lista de Produtos e a qual marca ele pertence

Produto               MArca
Aspirador de Pó   Marba
Aspirador de Pó   Tabajara
Torradeira           Marba
Torradeira           Tabajara
Geladeira            Tabajara
Geladeira            Capivara
Tanquinho           Capivara

A 6FN ela diz que os bancos de dados são uma extensão da álgebra relacional, existe um livro de Crhistopher J. Date e outros sobre Bancos de Dados temporais. Esse estudo pelo que eu vi é bem teórico, tem alguma aplicações práticas em cenários de DataWarehouse, existe alguma literatura do Date, e de um outro autor chamado Zimanyi, Agregações Temporais.
Espero que ajude, em poucas linhas é complicado definir, recomendaria procurar alguma literatura especializada em modelagem de dados. Outra coisa, lembre das implicações práticas. A normalização excessiva provoca um aumento no número de tabelas uma vez que se cria conjuntos não redundantes de dados, porém mais conjuntos. Em alguns cenários como DataWarehouses, as regras de normalização são desrespeitadas, em função da modelagem de DW trabalhar com preceitos de modelagem dimensional, uma vez que a informação é entendida como um conjunto de fatos de negócio rodeados por dimensões desses fatos.

segunda-feira, 4 de janeiro de 2010

Android

Apart from the Sql server usual topics, i´d like to register that i´m really impressed with the development of android plattform. you can find application for every task you want on the market, and they have a live community writing apps and earning money with that. It´s amazing the number of phones being shipped with the android os inside.. and what makes me more marvelled is how easy the operating system is to deal with.

I would say that even for a larger device, with this interface and plattform, android can grow to bigger devices in a near future, like ebook readers (let´s say nook from B&N) through tablet devices.

Android can represent the achievement that linux could not offer to the end user. Instead of fighting against Microsoft like linux and trying to create something better, Android is making it´s own way, which is awesome.

Microsoft a few years ago, was on the top about operating system technology, but somehow lost the track. I´m not an IPhone user neither an apple fan boy, but i have to agree that usability of the Iphone made everyone think about moving applications and making it more usable on the phone, and even more.. making its own phones with a better interface. Android could achieve that by creating an operating system open to multiple hardware platforms and using open standards all over it.

Google´s next step is Chrome OS, which can´t cause a huge impact on Windows empire… but again the innovation is on their side.More and more, the applications will become web based… and being such way, doesn´t matter if you are running Windows, Linux, Android, Leopard, Snow Leopard, Mountain Lion or anything… what do matter is what services can i bring to my browser, because every application would be a webpage.

More and more the strategies of those big companies would involve services, like google … like OVI from Nokia, like buying content with Itunes and so many others.

Big computing clouds will surround us, with content stored somewhere, published by someone, hosted by some machine in some datacenter around the world… but still it doesn´t matter, what matter is the content you receive, the physical location is not relevant because all the intelligence is stored somewhere, and your browser has just to render it before your eyes.

What we do, regarding SQL Server it is an important part of this future, because no matter what you do, the data must be stored somewhere, and must be accesible and secure. Must be quick to access, and must be reliable… the applications must work, webservices must be exposed. Services must be listening queues in order to process transactions, all this work is paramount to this future be viable.

When i look a small device carring a smart operating system on that i see i good idea, a lot of people working on it and a huge infrastructured to make it work.

 

Thanks

 

 

 

Congrats Android team.

About SQL Server Performance Parameters

Logical Disk: Disk Transfers/sec. This counter provides the overall throughput on the specific disk. Use this counter to monitor growth trends and forecast appropriately.

Logical Disk: Disk Read Bytes/sec & Disk Write Bytes/sec. This counter provides a measure of the total bandwidth for a particular disk.

Logical Disk: Average Disk sec/Read (Read Latency). This counter indicates the time it takes the disk to retrieve data. On well-tuned I/O subsystems, ideal values are 1-5 ms for logs (ideally 1 ms on a cached array), and 4-20 ms for data (ideally below 10 ms). Higher latencies can occur in peak times, but if high values are occurring regularly, investigate the cause.

Logical Disk: Average Disk sec/Write (Write Latency). This counter indicates the time it takes the disk to write the data. On well-tuned I/O subsystems, ideal values would be 1-5 ms for log (ideally 1 ms on a cached array), and 4-20 ms for data (ideally below 10 ms). Higher latencies can occur in peak times, but if high values are systematically occurring, investigate the cause.

Logical Disk: Average Disk Byte/Read. This counter indicates the size of I/Os being read. This value may affect disk latency and larger I/Os may result in slightly higher latency. When used to monitor SQL Server, this tells you the average size of the I/Os SQL Server is issuing.

Logical Disk: Average Disk Byte/Write. This counter indicates the size of I/Os being written. This value may affect disk latency and larger I/Os may result in slightly higher latency. When used to monitor SQL Server, this will tell you the average size of the I/Os SQL Server is issuing.

Physical Disk: % Disk Time: DataDrive. Monitor this counter to ensure that it remains below two times the number of disks.

Logical Disk: Current Disk Queue Length. For this counter, lower values are better. Values above 20 may indicate a bottleneck in the request waiting to be served by the disk, and should be investigated. Bottlenecks can create a backlog that may spread beyond the current server accessing the disk and result in long wait times for end users. Possible solutions to a bottleneck may be to add more disks to the RAID array, replace with faster disks, or move some of the data to other disks.

Logical Disk: Average Disk Queue Length. This counter indicates the average number of outstanding I/O requests. The general rule is that you should be at two or fewer outstanding I/O requests per spindle, but this may be difficult to measure due to storage virtualization, differences in RAID levels between configurations. Look for higher than average disk queue lengths in combination with higher than average disk latencies. This combination could indicate that the storage array cache is being over utilized or that spindle sharing with other applications is affecting performance.

Logical Disk: Average Disk Reads/Sec and Logical Disk: Average Disk Write/Sec.  These counters indicate the rate of read and write operations on the disk. Monitor these counters to ensure that they remain below 85 percent of the disk capacity. Disk access time increases exponentially if reads or writes are more than 85 percent of disk capacity. To determine the specific I/O capacity for your hardware, refer to the vendor documentation, or use the SQLIO disk subsystem benchmark tool to calculate it. For more information, see SQLIO Disk Subsystem Benchmark Tool (http://go.microsoft.com/fwlink/?LinkID=105586&clcid=0x409).

When you are using RAID configurations with the Logical Disk: Average Disk Reads/Sec or Logical Disk: Average Disk Write/Sec counters, use the formulas listed in the following table to determine the rate of I/Os on the disk.

RAID level

Formula

RAID 0

I/Os per disk = (reads + writes) / number of disks

RAID 1

I/Os per disk = [reads + (2 * writes)] / 2

RAID 5

I/Os per disk = [reads + (4 * writes)] / number of disks

RAID 10

I/Os per disk = [reads + (2 * writes)] / number of disks

For example, if you have a RAID 1 system with two physical disks, and your counters are at the values shown in the following table:

Counter

Value

Average Disk Reads/Sec

80

Average Disk Write/Sec

70

Average Disk Queue Length   

5

The I/O value per disk can be calculated as follows:

(80 + (2 * 70))/2 = 110

The disk queue length can be calculated as follows:

5/2 = 2.5

In this situation, you have a borderline I/O bottleneck.

Monitor disk latency and analyze trends. The amount of I/O and latency specific to SQL Server data files can be found by using the sys.dm_io_virtual_file_stats dynamic management view in SQL Server 2005. For more information, see sys.dm_io_virtual_file_stats (http://go.microsoft.com/fwlink/?LinkID=105587&clcid=0x409).

 

 

Credits for the PSS

Grouping Sets

This is a feature on SQL 2008, it is meant to allow multiple groupings within the same statement.

It could make this

 

SELECT customer, NULL as year, SUM(sales) FROM T GROUP BY customer

UNION ALL

SELECT NULL as customer, year, SUM(sales) FROM T GROUP BY year

 

into this

SELECT customer, year, SUM(sales)
FROM T
GROUP BY GROUPING SETS ((customer), (year))

 

(samples from MSDN website)

Sparse Columns and Sparse Property

This could be interesting to save space in case we have a lot of records that could be filled or null.

 

Take a look at the books online excerpt

Sparse columns are ordinary columns that have an optimized storage for null values. Sparse columns reduce the space requirements for null values at the cost of more overhead to retrieve nonnull values. Consider using sparse columns when the space saved is at least 20 percent to 40 percent. Sparse columns and column sets are defined by using the CREATE TABLE or ALTER TABLE statements.

Sparse columns can be used with column sets and filtered indexes:

  • Column sets
    INSERT, UPDATE, and DELETE statements can reference the sparse columns by name. However, you can also view and work with all the sparse columns of a table that are combined into a single XML column. This column is called a column set. For more information about column sets, see Using Column Sets.
  • Filtered indexes
    Because sparse columns have many null-valued rows, they are especially appropriate for filtered indexes. A filtered index on a sparse column can index only the rows that have populated values. This creates a smaller and more efficient index. For more information, see Filtered Index Design Guidelines.

Sparse columns and filtered indexes enable applications, such as Windows SharePoint Services, to efficiently store and access a large number of user-defined properties by using SQL Server 2008.

if you have the books online installed, take a look at the full article.

have fun

ms-help://MS.SQLCC.v10/MS.SQLSVR.v10.en/s10de_1devconc/html/ea7ddb87-f50b-46b6-9f5a-acab222a2ede.htm

 

 

USE AdventureWorks
GO

CREATE TABLE DocumentStore
(DocID int PRIMARY KEY,
Title varchar(200) NOT NULL,
ProductionSpecification varchar(20) SPARSE NULL,
ProductionLocation smallint SPARSE NULL,
MarketingSurveyGroup varchar(20) SPARSE NULL ) ;
GO

INSERT DocumentStore(DocID, Title, ProductionSpecification, ProductionLocation)
VALUES (1, 'Tire Spec 1', 'AXZZ217', 27)
GO

INSERT DocumentStore(DocID, Title, MarketingSurveyGroup)
VALUES (2, 'Survey 2142', 'Men 25 - 35')
GO

DBCC Statements

This one is cool… i found it on the internet

 

Shows the list of DBCC commands.  then
      DBCC HELP(<command>)
Shows the syntax of an individual co

If you run DBCC HELP on all the commands you end up with this list:

DBCC activecursors [(spid)]

DBCC addextendedproc (function_name, dll_name)

DBCC addinstance (objectname, instancename)

DBCC adduserobject (name)

DBCC auditevent (eventclass, eventsubclass, success, loginname
, rolename, dbusername, loginid)

DBCC autopilot (typeid, dbid, tabid, indid, pages [,flag])

DBCC balancefactor (variance_percent)

DBCC bufcount [(number_of_buffers)]

DBCC buffer ( {'dbname' | dbid} [, objid [, number [, printopt={0|1|2} ]
[, dirty | io | kept | rlock | ioerr | hashed ]]])

DBCC bytes ( startaddress, length )

DBCC cachestats

DBCC callfulltext

DBCC checkalloc [('database_name'[, NOINDEX | REPAIR])]
[WITH NO_INFOMSGS[, ALL_ERRORMSGS][, ESTIMATEONLY]]

DBCC checkcatalog [('database_name')] [WITH NO_INFOMSGS]

DBCC checkconstraints [( 'tab_name' | tab_id | 'constraint_name' | constraint_id )]
[WITH ALL_CONSTRAINTS | ALL_ERRORMSGS]

DBCC checkdb [('database_name'[, NOINDEX | REPAIR])]
[WITH NO_INFOMSGS[, ALL_ERRORMSGS]
[, PHYSICAL_ONLY][, ESTIMATEONLY][,DBCC TABLOCK]

DBCC checkdbts (dbid, newTimestamp)]

DBCC checkfilegroup [( [ {'filegroup_name' | filegroup_id} ]
[, NOINDEX] )] [WITH NO_INFOMSGS
[, ALL_ERRORMSGS][, PHYSICAL_ONLY][, ESTIMATEONLY][, TABLOCK]]

DBCC checkident ('table_name'[, { NORESEED | {RESEED [, new_reseed_value] } } ] )

DBCC checkprimaryfile ( {'FileName'} [, opt={0|1|2|3} ])

DBCC checktable ('table_name'[, {NOINDEX | index_id | REPAIR}])
[WITH NO_INFOMSGS[, ALL_ERRORMSGS]
[, PHYSICAL_ONLY][, ESTIMATEONLY][, TABLOCK]]

DBCC cleantable ('database_name'|database_id, 'table_name'|table_id,[batch_size])

DBCC cacheprofile [( {actionid} [, bucketid])

DBCC clearspacecaches ('database_name'|database_id,
'table_name'|table_id, 'index_name'|index_id)

DBCC collectstats (on | off)

DBCC concurrencyviolation (reset | display | startlog | stoplog)

DBCC config

DBCC cursorstats ([spid [,'clear']])

DBCC dbinfo [('dbname')]

DBCC dbrecover (dbname [, IgnoreErrors])

DBCC dbreindex ('table_name' [, index_name [, fillfactor ]]) [WITH NO_INFOMSGS]

DBCC dbreindexall (db_name/db_id, type_bitmap)

DBCC dbrepair ('dbname', DROPDB [, NOINIT])

DBCC dbtable [({'dbname' | dbid})]

DBCC debugbreak

DBCC deleteinstance (objectname, instancename)

DBCC des [( {'dbname' | dbid} [, {'objname' | objid} ])]

DBCC detachdb [( 'dbname' )]

DBCC dropcleanbuffers

DBCC dropextendedproc (function_name)

DBCC dropuserobject ('object_name')

DBCC dumptrigger ({'BREAK', {0 | 1}} | 'DISPLAY' | {'SET', exception_number}
| {'CLEAR', exception_number})

DBCC errorlog

DBCC extentinfo [({'database_name'| dbid | 0}
[,{'table_name' | table_id} [, {'index_name' | index_id | -1}]])]

DBCC fileheader [( {'dbname' | dbid} [, fileid])

DBCC fixallocation [({'ADD' | 'REMOVE'},
{'PAGE' | 'SINGLEPAGE' | 'EXTENT' | 'MIXEDEXTENT'}
, filenum, pagenum [, objectid, indid])

DBCC flush ('data' | 'log', dbid)

DBCC flushprocindb (database)

DBCC free dll_name (FREE)

DBCC freeproccache

dbcc freeze_io (db)

dbcc getvalue (name)

dbcc icecapquery ('dbname', stored_proc_name
[, #_times_to_icecap (-1 infinite, 0 turns off)])
Use 'dbcc icecapquery (printlist)' to see list of SP's to profile.
Use 'dbcc icecapquery (icecapall)' to profile all SP's.

dbcc incrementinstance (objectname, countername, instancename, value)

dbcc ind ( { 'dbname' | dbid }, { 'objname' | objid }, { indid | 0 | -1 | -2 } )

DBCC indexdefrag ({dbid | dbname | 0}, {tableid | tablename}, {indid |indname})

DBCC inputbuffer (spid)

DBCC invalidate_textptr (textptr)

DBCC invalidate_textptr_objid (objid)

DBCC iotrace ( { 'dbname' | dbid | 0 | -1 }
, { fileid | 0 }, bufsize, [ { numIOs | -1 }
[, { timeout (sec) | -1 } [, printopt={ 0 | 1 }]]] )

DBCC latch ( address [, 'owners'] [, 'stackdumps'])

DBCC lock ([{'DUMPTABLE' | 'DUMPSTATS' | 'RESETSTATS' | 'HASH'}] |
[{'STALLREPORTTHESHOLD', stallthreshold}])

DBCC lockobjectschema ('object_name')

DBCC log ([dbid[,{0|1|2|3|4}[,['lsn','[0x]x:y:z']|['numrecs',num]|['xdesid','x:y']
|['extent','x:y']|['pageid','x:y']|['objid',{x,'y'}]|['logrecs',
{'lop'|op}...]|['output',x,['filename','x']]...]]])

DBCC loginfo [({'database_name' | dbid})]

DBCC matview ({'PERSIST' | 'ENDPERSIST' | 'FREE' | 'USE' | 'ENDUSE'})

DBCC memobjlist [(memory object)]

DBCC memorymap

DBCC memorystatus

DBCC memospy

DBCC memusage ([IDS | NAMES], [Number of rows to output])

DBCC monitorevents ('sink' [, 'filter-expression'])

DBCC newalloc - please use checkalloc instead

DBCC no_textptr (table_id , max_inline)

DBCC opentran [({'dbname'| dbid})] [WITH TABLERESULTS[,NO_INFOMSGS]]

DBCC outputbuffer (spid)

DBCC page ( {'dbname' | dbid}, filenum, pagenum
[, printopt={0|1|2|3} ][, cache={0|1} ])

DBCC perflog

DBCC perfmon

DBCC pglinkage (dbid, startfile, startpg, number, printopt={0|1|2}
, targetfile, targetpg, order={1|0})

DBCC pintable (database_id, table_id)

DBCC procbuf [({'dbname' | dbid}[, {'objname' | objid}
[, nbufs[, printopt = { 0 | 1 } ]]] )]

DBCC proccache

DBCC prtipage (dbid, objid, indexid [, [{{level, 0}
| {filenum, pagenum}}] [,printopt]])

DBCC pss [(uid[, spid[, printopt = { 1 | 0 }]] )]

DBCC readpage ({ dbid, 'dbname' }, fileid, pageid
, formatstr [, printopt = { 0 | 1} ])

DBCC rebuild_log (dbname [, filename])

DBCC renamecolumn (object_name, old_name, new_name)

DBCC resource

DBCC row_lock (dbid, tableid, set) - Not Needed

DBCC ruleoff ({ rulenum | rulestring } [, { rulenum | rulestring } ]+)

DBCC ruleon ( rulenum | rulestring } [, { rulenum | rulestring } ]+)

DBCC setcpuweight (weight)

DBCC setinstance (objectname, countername, instancename, value)

DBCC setioweight (weight)

DBCC show_statistics ('table_name', 'target_name')

DBCC showcontig (table_id | table_name [, index_id | index_name]
[WITH FAST, ALL_INDEXES, TABLERESULTS [,ALL_LEVELS]])

DBCC showdbaffinity

DBCC showfilestats [(file_num)]

DBCC showoffrules

DBCC showonrules

DBCC showtableaffinity (table)

DBCC showtext ('dbname', {textpointer | {fileid, pageid, slotid[,option]}})

DBCC showweights

DBCC shrinkdatabase ({dbid | 'dbname'}, [freespace_percentage
[, {NOTRUNCATE | TRUNCATEONLY}]])

DBCC shrinkfile ({fileid | 'filename'}, [compress_size
[, {NOTRUNCATE | TRUNCATEONLY | EMPTYFILE}]])

DBCC sqlmgrstats

DBCC sqlperf (LOGSPACE)({IOSTATS | LRUSTATS | NETSTATS | RASTATS [, CLEAR]}
| {THREADS} | {LOGSPACE})

DBCC stackdump [( {uid[, spid[, ecid]} | {threadId, 'THREADID'}] )]

DBCC tab ( dbid, objid )

DBCC tape_control {'query' | 'release'}[,('\\.\tape')]

DBCC tec [( uid[, spid[, ecid]] )]

DBCC textall [({'database_name'|database_id}[, 'FULL' | FAST] )]

DBCC textalloc ({'table_name'|table_id}[, 'FULL' | FAST])

DBCC thaw_io (db)

DBCC traceoff [( tracenum [, tracenum ... ] )]

DBCC traceon [( tracenum [, tracenum ... ] )]

DBCC tracestatus (trace# [, ...trace#])

DBCC unpintable (dbid, table_id)

DBCC updateusage ({'database_name'| 0} [, 'table_name' [, index_id]])
[WITH [NO_INFOMSGS] [,] COUNT_ROWS]

DBCC upgradedb (db) DBCC usagegovernor (command, value)

DBCC useplan [(number_of_plan)]

DBCC useroptions DBCC wakeup (spid)

DBCC writepage ({ dbid, 'dbname' }, fileid, pageid, offset, length, data)