It’s been a while since I have blogged, but better late than never. I wanted to record what I have learned about DataFlux, which was entirely new to me before Tuesday. DataFlux has an office here in the Triangle and was acquired by SAS in 2000. The current version of Data Management Studio is 2.2. The lifecycle consists of Plan (define/discover), Act (design/execute), and Monitor (evaluate/control). The components of the DataFlux product line are Data Management Studio, Web Studio, Federated Server (outbound connections, virtualization, multiple data sources), and Data Management Server (jobs, alerts, triggers). The data management repository consists of the rps file and the proprietary files.

To create a new repository, go to the Administration riser on the left, find the repository definitions tree, right-click it, and select New.

QKB (Quality Knowledge Base): houses schemes, definitions, etc., i.e. what names should look like. It can be found under repository definitions and should be set as the default.

The Data riser is where you can explore tables and ODBC connections.

Data Collections is where you group like fields.

Field Match, Table Match, Identification Analysis

Higher sensitivity means stricter matching, so two values are less likely to match.
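To make the sensitivity note concrete for myself, here is a minimal Python sketch. It is not how DataFlux computes matches; it simply assumes a similarity score from 0 to 100 (taken from Python's `difflib`) and treats the sensitivity as the threshold that score must reach. The function name `is_match` and the sample names are my own, purely for illustration.

```python
from difflib import SequenceMatcher


def is_match(a: str, b: str, sensitivity: int) -> bool:
    """Return True when the similarity of a and b (0-100) reaches the sensitivity.

    Assumption: similarity comes from difflib, not from DataFlux itself.
    """
    similarity = SequenceMatcher(None, a.lower(), b.lower()).ratio() * 100
    return similarity >= sensitivity


if __name__ == "__main__":
    # The same pair is tested at a lower and a higher sensitivity.
    for sensitivity in (85, 95):
        result = is_match("Jon Smith", "John Smith", sensitivity)
        print(f"sensitivity={sensitivity}: match={result}")
```

Raising the threshold can only turn matches into non-matches, never the other way around, which is the behavior the note describes.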

Data profiling – look for errors with frequencies and patterns
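Here is a rough Python sketch of what frequency and pattern profiling does, as I understand it. The pattern alphabet (`9` for a digit, `A` for an uppercase letter, `a` for a lowercase letter) and the function names `to_pattern` and `profile` are my assumptions for illustration, not the DataFlux implementation. The phone numbers are made up.

```python
from collections import Counter


def to_pattern(value: str) -> str:
    """Reduce a value to its shape: digits -> 9, uppercase -> A, lowercase -> a."""
    shape = []
    for ch in value:
        if ch.isdigit():
            shape.append("9")
        elif ch.isalpha():
            shape.append("A" if ch.isupper() else "a")
        else:
            shape.append(ch)  # keep punctuation and spaces as they are
    return "".join(shape)


def profile(values):
    """Return (value frequencies, pattern frequencies) for a column of strings."""
    return Counter(values), Counter(to_pattern(v) for v in values)


if __name__ == "__main__":
    phones = ["919-555-0100", "919-555-0101", "9195550102", "919-555-01O3"]
    _, patterns = profile(phones)
    # Rare patterns are the ones worth inspecting for errors.
    for pattern, count in patterns.most_common():
        print(pattern, count)
```

The rare patterns are where the errors tend to hide: in the sample, the last value has a letter O typed in place of a zero, and it shows up as its own pattern.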

Analysis report – what is right

Create a standardization scheme
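The way I picture a standardization scheme is as a lookup from the variants found in the data to one standard value. The Python sketch below shows that idea only; the `SCHEME` contents and the `standardize` function are hypothetical examples of mine, not anything exported from DataFlux.

```python
# Hypothetical scheme: each variant (lowercased) maps to one standard value.
SCHEME = {
    "nc": "North Carolina",
    "n.c.": "North Carolina",
    "n carolina": "North Carolina",
    "north carolina": "North Carolina",
}


def standardize(value: str, scheme: dict = SCHEME) -> str:
    """Return the standard value for a known variant, else the input unchanged."""
    # Normalize case and collapse repeated whitespace before the lookup.
    key = " ".join(value.lower().split())
    return scheme.get(key, value)


if __name__ == "__main__":
    for raw in ["N.C.", "  n   carolina ", "Virginia"]:
        print(repr(raw), "->", repr(standardize(raw)))
```

Values that are not in the scheme pass through untouched, so they can be reviewed and added to the scheme later.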

Phrase or identification analysis

February 24, 2012