It’s been a while since I last blogged, but better late than never. I wanted to record what I have learned about DataFlux, which was entirely new to me before Tuesday. DataFlux has an office here in the Triangle and was acquired by SAS in 2000. The current version of Data Management Studio is 2.2. The data management lifecycle consists of three phases: Plan (define/discover), Act (design/execute), and Monitor (evaluate/control). The components of the DataFlux product line are Data Management Studio, Web Studio, Federated Server (outbound connections, virtualization, multiple data sources), and Data Management Server (jobs, alerts, triggers). The data management repository consists of the .rps file and the proprietary files.
To create a new repository, go to the Administration riser on the left, find the Repository Definitions tree, right-click, and select New.
QKB (Quality Knowledge Base) – houses schemes, definitions, etc.; roughly, what names should look like. It can be found under Repository Definitions and should be set as the default.
The Data riser is where you can explore tables and ODBC connections.
Data Collections is where you group like fields
Field Match, Table Match, Identification Analysis
Higher sensitivity, less likely to match
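To convince myself of the sensitivity rule, here is a toy sketch in Python. It is not DataFlux's actual match-code algorithm: the `match_code` function and the "keep one character per ten points of sensitivity" rule are my own assumptions, meant only to show why a higher sensitivity keeps more detail and therefore produces fewer matches.

```python
import re


def match_code(value: str, sensitivity: int) -> str:
    """Toy match code (hypothetical, not the DataFlux algorithm).

    Normalizes the value, then keeps more leading characters as the
    sensitivity rises, so higher sensitivity means stricter matching.
    """
    # Uppercase and drop everything except letters and digits.
    normalized = re.sub(r"[^A-Z0-9]", "", value.upper())
    # Assumed rule: one kept character per 10 points of sensitivity.
    keep = max(1, sensitivity // 10)
    return normalized[:keep]


a, b = "Jonathan Smith", "Jonathon Smith"

# Low sensitivity: both names collapse to the same code, so they match.
low_match = match_code(a, 50) == match_code(b, 50)

# High sensitivity: the differing vowel survives, so they do not match.
high_match = match_code(a, 90) == match_code(b, 90)
```

With these two names, `low_match` is true and `high_match` is false, which is the behavior the note above describes.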
Data profiling – look for errors with frequencies and patterns
Analysis report – what is right
Create a standardization scheme
Phrase or identification analysis
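The profiling and standardization ideas above can be sketched outside the tool in a few lines of Python. This is only an illustration under my own assumptions: the pattern notation (`A` for an uppercase letter, `a` for lowercase, `9` for a digit), the sample values, and the `SCHEME` mapping are all made up for the example and are not taken from a QKB.

```python
from collections import Counter


def pattern_of(value: str) -> str:
    """Reduce a value to its character pattern (assumed notation)."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("A" if ch.isupper() else "a")
        else:
            out.append(ch)  # keep punctuation and spaces as-is
    return "".join(out)


def pattern_frequencies(values):
    """Count how often each pattern occurs; rare patterns are suspects."""
    return Counter(pattern_of(v) for v in values)


# Hypothetical standardization scheme: variant -> standard value.
SCHEME = {"n.c.": "NC", "north carolina": "NC", "nc": "NC"}


def standardize(value: str, scheme: dict) -> str:
    """Apply the scheme; values not in the scheme pass through unchanged."""
    return scheme.get(value.strip().lower(), value)


states = ["NC", "N.C.", "North Carolina", "nc", "NC", "27513"]
freqs = pattern_frequencies(states)
cleaned = [standardize(s, SCHEME) for s in states]
```

Here the pattern `AA` is the most frequent one, and the lone `99999` pattern stands out as a zip code sitting in a state field. The scheme then folds the spelling variants into `NC` while leaving the unexpected value alone for someone to review.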