New Features and Bug Fixes in Version 1.9
Batch
Data profiling - Descriptive stats - Visual view with drill down
In the prior sprint, the configuration and execution of descriptive stats were completed. In the current sprint, the following has been completed:
1. Ability to visually plot the metrics
2. Ability to drill down on a specific column to get the distribution charts
3. Ability to drill down on a specific metric to view the data contributing to it
Feedback on the UX has been received, and the changes will be implemented in the upcoming sprints.
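As a rough illustration of the three capabilities above, the sketch below computes column metrics, buckets values for a distribution chart, and returns the rows behind a metric. It is a minimal sketch, not the product's implementation: the function names `describe_column`, `distribution`, and `rows_contributing`, the use of `None` for missing values, and the equal-width binning are all assumptions made for the example.

```python
from collections import Counter
from statistics import mean, median, pstdev


def describe_column(values):
    """Return basic descriptive metrics for a numeric column (None = missing).

    Hypothetical helper; assumes at least one non-null value.
    """
    present = [v for v in values if v is not None]
    return {
        "count": len(present),
        "nulls": len(values) - len(present),
        "min": min(present),
        "max": max(present),
        "mean": mean(present),
        "median": median(present),
        "stddev": pstdev(present),
    }


def distribution(values, bins=4):
    """Bucket non-null values into equal-width bins (assumed binning scheme)."""
    present = [v for v in values if v is not None]
    lo, hi = min(present), max(present)
    # Fall back to a width of 1 when every value is identical.
    width = (hi - lo) / bins or 1
    # The maximum value is clamped into the last bin.
    counts = Counter(min(int((v - lo) / width), bins - 1) for v in present)
    return [counts.get(i, 0) for i in range(bins)]


def rows_contributing(rows, column, predicate):
    """Drill down on a metric: return the rows whose column value matches."""
    return [row for row in rows if predicate(row[column])]
```

For example, drilling down on a null-count metric for a hypothetical `age` column would be `rows_contributing(rows, "age", lambda v: v is None)`.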
Domain flow
Introduced Domain: Domain is a way to group the assets in the data marketplace.
- We will be renaming the domain that we had in the My Projects flow to "Functional Area"
- We completed the create and edit flows for domains in the current sprint
- We do not have RBAC for domains in the current sprint; it will be implemented in the upcoming sprint
Streaming
- Data-specific types in the target for the Flatten transformation with Kafka as the source: data types are captured from the Kafka topic and carried forward all the way to the target
- Redshift target using the native Spark Redshift connector has been implemented
BSP (QUERY BASED)
- Snowflake and Redshift as Source and Target are implemented
Visual Logs Changes in BSP:
- After upgrading the backend tech stack for BSP (Confluent Kafka to version 7.3.1), the required backend changes have been made and tested
- A Cancel button has been added to the loader page shown while starting a BSP pipeline
- For the Oracle log-based source, the Bin logs status is displayed as enabled or disabled
- While creating a BSP pipeline, changing the details of an already configured source table displays a confirmation pop-up before the Catalog and Schema are changed
Data Marketplace:
Business Modeler (Local and Remote):
- We initiated work on the Business Modeler in the prior sprint
- In the current sprint, we have completed the first version of the Business Modeler, with the ability to connect to local and a limited set of remote connections and create business models
- Version 1 of the business name and description recommendation solution is also implemented in the current sprint
Data Product
- We have initiated work on phase 1 of the data product in the current sprint
- Design and basic integration are completed in the current sprint. We expect to have the MVP by the next sprint
Jobs
CI/CD is completed in this sprint.
DATA WRANGLER:
- Pagination functionality on the grid with filters and sorting is implemented
- Data domain and data type list is updated
- After sampling the data, three options are now available:
  - Edit recipe config
  - Start wrangling
  - Recipe details
- Filter data based on data quality metrics
- Custom remove/replace equals condition is added
- Custom remove/replace REGEX info is added
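The equals and REGEX remove/replace conditions listed above can be pictured with the sketch below. It is only an illustration under assumptions: the helper names are hypothetical, the column is modelled as a plain Python list, and Python's `re` syntax stands in for whichever regex dialect the wrangler actually uses.

```python
import re


def replace_equals(values, target, replacement):
    """Replace every value that is exactly equal to `target`."""
    return [replacement if v == target else v for v in values]


def remove_equals(values, target):
    """Drop every value that is exactly equal to `target`."""
    return [v for v in values if v != target]


def replace_regex(values, pattern, replacement):
    """Replace each regex match inside string values; other values pass through."""
    compiled = re.compile(pattern)
    return [
        compiled.sub(replacement, v) if isinstance(v, str) else v
        for v in values
    ]


def remove_regex(values, pattern):
    """Drop string values that contain a match for `pattern`."""
    compiled = re.compile(pattern)
    return [
        v for v in values
        if not (isinstance(v, str) and compiled.search(v))
    ]
```

The equals variants compare the whole value, while the regex variants match anywhere in the string unless the pattern is anchored with `^` and `$`.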
AI-ML:
- Pagination functionality on the grid with filters and sorting is implemented
- Added a breadcrumb to navigate to the project dashboard and the project details of the current project
- Replaced all the phase loaders with sectional loaders limited to the relevant section of the screen
- Added a dataset option on the prediction screen
- Added group-by-date functionality in the forecasting flow
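One way to picture the group-by-date step in the forecasting flow is to roll timestamped records up into one value per calendar day before forecasting. The sketch below assumes ISO 8601 timestamps, summation as the aggregate, and the hypothetical field names `timestamp` and `value`; the actual aggregation used by the product is not specified here.

```python
from collections import defaultdict
from datetime import datetime


def group_by_date(records, timestamp_key="timestamp", value_key="value"):
    """Sum values per calendar day and return (date, total) pairs in date order."""
    totals = defaultdict(float)
    for record in records:
        # Assumes ISO 8601 strings such as "2023-01-02T10:00:00".
        day = datetime.fromisoformat(record[timestamp_key]).date()
        totals[day] += record[value_key]
    return sorted(totals.items())
```

Sorting the result by date gives the evenly ordered daily series that a forecasting model would typically consume.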
Projects Dashboard:
- Added details for ML models on the project dashboard, such as metric score, problem type, and target column
- Added delete functionality for ML models
- Added details for datasets on the project dashboard, such as created on, created by, last modified on, and last modified by