Tag Archives: SQL Server 2012

SQL Server 2012: Data Quality Services

With the release of SQL Server 2012, I am looking more into Master Data Services (MDS) and Data Quality Services (DQS). A brief overview of DQS.

You install DQS with SQL, and you have to configure it. The server configuration is a cmd line process that runs to create some databases on your server (DQS_MAIN, DQS_PROJECTS, DQS_STAGING_DATA).

I ran into one issue with the running of the configuration, not sure if this happens everywhere, I am running Windows 8, but nonetheless, I ran into. After running the tool and getting error after error, and trying as admin, etc. I dug deeper into the error message and found that I there were some security/permission issues I had to resolve. It ended up being that I had to change permissions on

C:WindowsMicrosoft.NETFramework64v4.0.30319Configmachine.config

to allow write access. Once I did that, the configuration tool worked and I could get into DQS.

DQS gives you a “Data Quality Services Client” to work with. When you open it, connect to the database where you configured the three databases I talk about above. Once you, you have 3 panes.

You have Knowledge Bases, Data Quality Projects, and Administration.

Knowledge Bases: datasets of known data that you can use in your Data Quality Projects. You get a default Knowledge Base – state names and some other data similar to that.

Data Quality Projects: Here is where the magic happens. You can choose some source data (Excel xls – xlsx wouldn’t work or SQL table) and then apply your knowledge base on it. Then you can reimport your data at the end back into SQL or export it, and update your Knowledge Base with learned values.

Administration: Not a ton of options, but you can set some thresholds, and also setup your Azure data market settings.

Azure Data Market https://datamarket.azure.com/browse/Data?Category=dqs – Lots of data you can use to combine with your Knowledge Bases. Much more here and I won’t go into detail – it could be its own post in itself.

As a test, I took an excel file, added a few records with columns first, last, city, state (I actually imported into a staging SQL table to work with it) – But in the state field I put different variations of state, WI, Wis, Wisconsin, MN, Minn, Minn., etc.

I then ran the file through creating a new data quality project and ran it against the default Knowledge Base, and it corrected the values it could. Got a weird error clicking next on the project, it seems the button is touchy. Hopefully they come out with a fix soon.

Once you build up and get your Knowledge Base stable, you can use from SSIS packages or in Master Data Services. I see many useful applications for DQS. Either around your corporate data or pulling in data from Azure data market to cleanse existing data you might have (think: looking up gender from first/last name).

This post is a brief look at DQS and how it works, but there is so much more. I hope to get more in depth in the near future.

Advertisements

Reflecting on 1000 Posts

First, I want to give an update on what’s up recently. First off, work is busy as ever. SQL 2012 RTM’d, so that is cool on the Business Intelligence side, and on the Dev side, working on moving a desktop app to the cloud, Azure specifically. Many things going on, as well as keeping on the SharePoint and Yammer trains. Enterprise Social is just “starting” and there is much more happening. Windows 8 is still good. Some quirks but very good, excited for the RC and RTM versions.

It has been a while since I have posted, for good reason, but of course it usually boils down to “just time”. This is my 1000th post on this blog. Well, not really, I had some that I probably removed over the years. Back since 2004, I have moved this blog from blogger.com, to wordpress, to my own wordpress (on windows!) to different providers and hosters and what not. It has been a road. And more to come. I have a few drafts queued up, but was waiting to have something good for post 1000.

So what else? Other things are keeping me busy. Since last October, Homebrewing has become a big hobby of mine.

We have done 8 batches so far, and it keeps getting better and more fun. We are now past kits and we now are picking our ingredients, which is cool. The beer is good too :)

MADPASS (www.madpass.org) – I am on the board with MADPASS, we hit our 1 year mark in Feb, which was cool.

We keep doing cool new things, the speakers are good, and the “community” is growing. SQL Saturday 118 is coming up on April 21st as well in Madison (at MATC) – and I am helping out with that as well.

Besides that? Always the same old tech things I do every day. Finally consolidated my Google life to my Google Apps (more on that in a future post). No, no iPad 3/Next for me yet. Maybe though. Thinking about my next tech gadget. Chromebook? New TV? Not sure yet. Nothing really pressing there.

What else could there be? Well… a couple of things :)

I have been in the Madison area since November 2007. Been at Trek since Sept 2008. I have lived in Madison (apts), Sun Prairie (condo) and now in an apartment downtown Sun Prairie. It is a cool place. Nice view. Right in the thick of everything that goes on downtown (parades, festivals, etc). But.. the apt life wears on you. We have been “looking” for a house for a while now. Not ready to just go pick one out that is on the market, but wait for the right one. Right area, right price, right look, etc. Well, we found one. In less than 2 weeks, I will be moving to a house in Sun Prairie. It is about 6 blocks away from downtown, so within walking distance of everything (cough cough – Eddies), but yeah, nice little place. In a semi-secluded street. 3 bed, 1.5 Bath, Nice fenced in yard. Should be a nice house.

But wait, there’s more!

Now, I would say that I have everything I need to be happy, and that is true. But sometimes things happen and you get things in your life that will even make you *more* happy. Well, it is a good thing the new house is a 3 bedroom, because come fall there will be another baby Novoselac running around. Yes! We are expecting. Ella will be a big sister and of course our lives will change for the better once again. I am excited but will be even more excited around October!

So until then, and even after, we will keep watching the movie that we call life and having fun.

Ok. 1000 down. couple more thousand to go. Keep rockin’ everyone.

#sqlpass 2011 Summit Thoughts

Want to get a quick post out before I forget everything cool I have been thinking about the 2011 SQL Pass Summit in Seattle WA.

The conference started out great with a first timer reception (this was my second year), and I knew a few first timers so that would great.

The first day keynote was good, and then the sessions I hit were on a full range of things, new semantic search stuff with file table, and more.

I missed a couple of sessions on day 1 because I was in a Microsoft led focus group on “BI in the cloud” some very cool ideas thrown out and excited to see what is coming up in the future.

More good keynote on day two, a deeper dive into the stuff in SQL Server 2012 (formerly Denali). Went to some good sessions on Data Quality Services (DQS) and Vertipaq vs OLAP, and Power View (remember the space!) SharePoint and Power View integration, etc. Some very cool stuff coming soon with the 2012 release.

Tabular cubes vs Multidimensional cubes. OLAP vs Vertipaq, etc. Big debates and questions here, when to put things in Tabular vs MDX, etc. DAX solves some hard problems we run into with MDX.

Another thing I went to a couple sessions on was StreamInsight. This is some very cool technology for complex event processing, using .NET and LINQ. The StreamInsight guys are crazy smart. Also integrating StreamInsight with Azure, and doing some crazy processing and analysis of tons of data in near real time. I see this technology making a difference somewhere down the road.

Power View was another big winner from the summit. This integrates with SharePoint 2010 as a Silverlight (SL still isn’t dead?) report builder and viewer. Not even sure they are called “reports” anymore, but “views”. You create views and you can interact/edit them in real time. Export to PowerPoint too. These views work off “tabular” vertipaq BISM (BI semantic model) cubes. This technology is going to change how we deliver information to the business.

Being able to take a PowerPivot (no space!) and reverse engineer it into SQL Server Data Tools (formerly BIDS – in VS2010 shell) to create a BISM tabular cube and publish to SSAS Tabular instance looks great. Solving problems like crazy snapshotting and many to many and time analysis using DAX looks very useful.

Of course, GameWorks appreciation night was fun, and I found a Mcmenamins (Six Arms) up the road from the convention center, so I got get a taste of Ruby again. Good conversation and times at the Taphouse and around the Sheraton and Convention center was good too. Seattle is a nice city and the weather was pretty good I thought. Another great summit.