New Features and Bug Fixes in Version 1.10
Batch and Data Mesh:
- Domain flow - User group changes
In the prior sprint, the domain flow included a user group flow, similar to projects. The new specifications define a fixed set of Roles/Personas, so the flow has been modified to let users be assigned roles directly.
- Business Modeler - Changes to Uni-Source and Multi-Source
In prior sprints, business models were categorized as Local, Remote, and Hybrid. Per the updated requirements, they are now categorized as Uni-Source and Multi-Source, and the corresponding changes were made in the Business Modeler flow.
- Data Product - Live and refresh changes with scheduling and sessions
Data products are now categorized as Live or Cached. Live products always serve data by querying the underlying source system in real time; only their quality metrics are refreshed according to the scheduling options provided by the user. For Cached products, both the data and the quality metrics are cached and refreshed according to the scheduling options provided by the user, so a query returns data from the last cached snapshot. Session history is maintained for each scheduled refresh.
- Data Marketplace - My Products view
This view lets users explore all of their Subscribed, Rejected, Bookmarked, and other products. Search, Filter, and Sort work the same way as in the Marketplace. This view was implemented in the current sprint.
- Data Marketplace - Access Requests
This view is enabled only for business owners, who can view all subscription requests and review them to accept or reject each request. This view was implemented in the current sprint.
- Domain, Business Modeler, Data Product - Initial Demo feedback
The feedback items provided by product owners on the marketplace have been implemented.
- Trino to Spark SQL
The processing engine for the Business Modeler has been switched from Trino to Spark.
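The Live and Cached behavior described under "Data Product - Live and refresh changes" can be illustrated with a minimal sketch. The `DataProduct` class, its fields, and the `row_count` metric are hypothetical names chosen for illustration and are not the platform's actual API; the source system is stood in for by a plain Python callable.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Rows = List[dict]


@dataclass
class DataProduct:
    """Hypothetical model of a data product (illustration only)."""
    name: str
    mode: str                          # "live" or "cached"
    query_source: Callable[[], Rows]   # stands in for a query against the source system
    snapshot: Optional[Rows] = None    # last cached data (cached mode only)
    quality: Dict[str, int] = field(default_factory=dict)
    sessions: List[dict] = field(default_factory=list)

    def refresh(self) -> None:
        """Run one scheduled refresh and record it in the session history."""
        rows = self.query_source()
        if self.mode == "cached":
            # Cached products keep a copy of the data itself.
            self.snapshot = list(rows)
        # Quality metrics are refreshed on schedule in both modes.
        self.quality = {"row_count": len(rows)}
        self.sessions.append({"run": len(self.sessions) + 1,
                              "row_count": len(rows)})

    def read(self) -> Rows:
        """Serve data: live queries the source, cached returns the snapshot."""
        if self.mode == "live":
            return self.query_source()
        if self.snapshot is None:
            raise RuntimeError("cached product has not been refreshed yet")
        return self.snapshot


# Usage: the same changing source seen through a live and a cached product.
source_rows = [{"id": 1}]
live = DataProduct("orders_live", "live", lambda: list(source_rows))
cached = DataProduct("orders_cached", "cached", lambda: list(source_rows))

live.refresh()
cached.refresh()
source_rows.append({"id": 2})   # the source changes after the refresh

live_count = len(live.read())      # sees the new row immediately
cached_count = len(cached.read())  # still serves the last snapshot
```

In this sketch a cached product only picks up the new row after its next `refresh()`, while the quality metrics of the live product stay at their last scheduled value until then.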
Streaming and BSP:
- Kafka target
The Kafka target now supports multiple format types, such as CSV, JSON, and AVRO.
- Streaming bug fixes
136 bugs were fixed in this sprint.
- MongoDB source - Native Spark connector integration
In DataFactory, we previously used the Mongo JDBC jar to connect to the MongoDB source for real-time streaming. We have now moved to the Spark MongoDB connector jar, so MongoDB is connected natively as a source in the streaming pipeline.
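As a rough illustration of what format selection on the Kafka target (see "Kafka target" above) means for a message payload, the sketch below encodes one record as CSV or JSON using only the Python standard library. The `serialize` function is a hypothetical helper, not part of DataFactory. AVRO is deliberately left out because it requires a schema and a third-party library.

```python
import csv
import io
import json


def serialize(record: dict, fmt: str) -> bytes:
    """Encode one record as a message payload (hypothetical helper)."""
    fmt = fmt.upper()
    if fmt == "JSON":
        return json.dumps(record, sort_keys=True).encode("utf-8")
    if fmt == "CSV":
        buf = io.StringIO()
        # Columns are written in sorted key order so the layout is stable.
        csv.writer(buf).writerow([record[key] for key in sorted(record)])
        # Drop the trailing line terminator added by the csv writer.
        return buf.getvalue().rstrip("\r\n").encode("utf-8")
    # AVRO needs a schema and an Avro library, so it is not sketched here.
    raise ValueError(f"format not covered by this sketch: {fmt}")


record = {"id": 1, "name": "a,b"}
json_payload = serialize(record, "json")
csv_payload = serialize(record, "csv")
```

The CSV payload quotes the `name` value because it contains the delimiter, which is one reason a format choice matters for downstream consumers.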
Data Wrangler:
- Added SQL Query feature
AI-ML:
- Added a feature to balance imbalanced classes with oversampling or undersampling
- The functionality of the Select Features, Treat Outliers, and Transform screens is now available on a single screen
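The class-balancing feature above can be pictured with a minimal sketch of random oversampling and undersampling. This is an assumption about the general technique, not the product's implementation; the `rebalance` function and its parameters are hypothetical.

```python
import random
from collections import Counter, defaultdict


def rebalance(rows, label_key, strategy="oversample", seed=0):
    """Return rows with equal class counts (hypothetical helper).

    oversample:  duplicate random minority rows up to the largest class size.
    undersample: keep a random subset of each class at the smallest class size.
    """
    if strategy not in ("oversample", "undersample"):
        raise ValueError(f"unknown strategy: {strategy}")
    if not rows:
        return []

    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[label_key]].append(row)

    sizes = [len(group) for group in by_class.values()]
    target = max(sizes) if strategy == "oversample" else min(sizes)

    balanced = []
    for group in by_class.values():
        if len(group) >= target:
            # Sample without replacement (keeps every row when sizes match).
            balanced.extend(rng.sample(group, target))
        else:
            # Keep all rows and add random duplicates to reach the target.
            extra = rng.choices(group, k=target - len(group))
            balanced.extend(group + extra)
    return balanced


rows = [{"y": "fraud"}] * 2 + [{"y": "ok"}] * 8
over = Counter(r["y"] for r in rebalance(rows, "y", "oversample"))
under = Counter(r["y"] for r in rebalance(rows, "y", "undersample"))
```

Oversampling keeps all original rows at the cost of duplicates, while undersampling discards majority-class rows; which one fits depends on how much data the minority class has.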