Monday, December 30, 2013

SQL Server table sample

When you work with databases, it is often useful to see a sample of the data in order to understand its meaning and the meaning of the table.
One way to understand a table (entity) is to see which columns (attributes) it has. A standard way to do that is to execute the following query:

SELECT * FROM Person.Person WHERE 1 = 0

As a result of that query, you get a list of the table's attributes (columns), but no rows, because the WHERE clause is never true.
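Another way to list a table's columns, together with their data types, is through the standard INFORMATION_SCHEMA views (shown here for the same AdventureWorks table):

```sql
-- List the columns and data types of Person.Person
SELECT COLUMN_NAME, DATA_TYPE, IS_NULLABLE
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'Person'
  AND TABLE_NAME = 'Person'
ORDER BY ORDINAL_POSITION;
```

The system procedure sp_help 'Person.Person' returns similar information, along with indexes and constraints.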

Since SQL Server 2005, there is a feature which gives the user a sample of the table data, through which the user can understand the table (and the entity it describes) better. The SQL query which gives you sample data is:

SELECT FirstName, LastName
FROM Person.Person TABLESAMPLE (1 PERCENT)

In the case of the AdventureWorks2008R2 database, that SQL statement returns a varying number of rows each time it runs, e.g. 170, 355, or 389. So the resultset is not static; it changes from run to run, because TABLESAMPLE samples whole data pages rather than individual rows.
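If a stable sample is needed, for example for reproducible tests, TABLESAMPLE accepts a REPEATABLE option; with the same seed value (and unchanged data), the same rows are returned on every run:

```sql
SELECT FirstName, LastName
FROM Person.Person TABLESAMPLE (1 PERCENT) REPEATABLE (205);
```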

It is also possible to use the TABLESAMPLE clause with an approximate number of rows to be returned in the recordset:

SELECT FirstName, LastName
FROM Person.Person TABLESAMPLE (100 ROWS) ;
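Because TABLESAMPLE works at the page level, the ROWS value is only approximate. When an exact number of random rows is required, a common (though slower, since it scans the whole table) alternative is TOP with ORDER BY NEWID():

```sql
-- Exactly 100 random rows, at the cost of a full scan and sort
SELECT TOP (100) FirstName, LastName
FROM Person.Person
ORDER BY NEWID();
```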

Thursday, December 26, 2013

Database build and versioning - part I

Database versioning is quite a difficult task. It becomes even more difficult when you need to upgrade a database from one version to a newer one, and usually that newer version is several releases away.
I have been thinking and working in that field for several years and have acquired and developed several possible solutions, and I am sure that there are more good solutions to be found.
The first thing needed is an automatic database script build. Before that, a company should have a "night build" system in place, but as the speed of development increases, the "night build" may be run several times a day.
During that "night build" process, database scripts are built.
Another process, which creates the software installation, picks up these files and includes them in the installation.
As a part of the software installation, these database scripts should somehow be executed in order to have the database installed.
Each of these actions requires some attention and an appropriate approach to be designed and developed.

I have experienced three types of database files:
1. database scripts in the form of SQL files
2. database backup
3. Microsoft database project.

All approaches were very interesting to develop, and each has some good parts and some bad parts. While creating them, I always asked myself: how easily will they be deployed, how will I keep track of what is deployed on one project versus another, and how will I solve the problem of database versioning?

It is also worth noting that the methodology for how database installation files are created should be distinguished from the process of installation itself.

For all of these approaches, a process needs to be designed which takes the current company processes into account and creates a database build process that fits into them. The most important thing is that you need to have the database files: in the first instance you can create them manually, but the installation build process must pick something up from some location, and the installation process must install something. When the process is designed like that, over time it will grow into something more sophisticated, more automated, and of higher quality.
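One simple building block for the versioning problem described above - a sketch, not taken from any particular product, with hypothetical table and script names - is a version table kept in the database itself, which the installation process consults before deciding which scripts to run:

```sql
-- Hypothetical example: track applied upgrade scripts in the database itself
CREATE TABLE dbo.SchemaVersion (
    VersionNumber INT           NOT NULL PRIMARY KEY,
    ScriptName    NVARCHAR(260) NOT NULL,
    AppliedOn     DATETIME      NOT NULL DEFAULT GETDATE()
);

-- After each upgrade script succeeds, record it:
INSERT INTO dbo.SchemaVersion (VersionNumber, ScriptName)
VALUES (42, N'042_add_customer_index.sql');

-- The installer can then skip scripts that are already applied:
SELECT MAX(VersionNumber) AS CurrentVersion FROM dbo.SchemaVersion;
```

With such a table, the same set of scripts can be shipped to every project, and the installer runs only those with a higher number than the database's current version.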


Wednesday, January 9, 2013

Database and disk size estimation

Since I get a lot of questions regarding the estimation of database sizes, I have calculated that spending too much time trying to predict the exact size of a database is much more expensive than spending money on a larger disk.
Many customers want to have the exact size, and the formula by which we estimated the database size. That estimation is directly connected with money, since when we give an estimation, the customer is going to buy disks of that size.
So, in order to calculate how much time is profitably spent on this task, we need only two numbers: the SAN price per GB (with all costs included) and the hourly cost of the people involved in the process of database (and disk) size estimation.
A very rough estimate for SAN storage varies between a few dollars and a maximum of about 40 dollars per GB.
Also, if you include in the calculation the well-known Pareto rule, which says that with 20% of the time you reach 80% of your goal, it means we do not have to spend too much time on disk size prediction - just tell the customer to buy a big enough disk, and if there is not enough space, buy more... if any customer is willing to accept this explanation. :-)
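As a rough illustration (all numbers below are assumptions, not figures from a real project), the break-even point follows directly from those two numbers:

```latex
\text{break-even hours} = \frac{\text{extra GB} \times \text{price per GB}}{\text{hourly cost of the team}}
                        = \frac{100 \times \$40}{\$100/\text{h}} = 40 \text{ hours}
```

That is, at the upper price of 40 dollars per GB, over-provisioning an extra 100 GB costs 4,000 dollars; if the estimation involves a team whose combined time costs 100 dollars per hour, any estimation effort beyond 40 hours is more expensive than simply buying the larger disk.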