Immuta, the provider of an interoperable data security platform, has recently announced integrations and updates aimed at bolstering security for Databricks. Databricks, a software company founded by the creators of Apache Spark, offers data warehouse and data lake solutions.
In recent years, Databricks has introduced a new architectural approach that combines data warehouses and data lakes into the “lakehouse” model. With the company now investing heavily in artificial intelligence (AI), more advanced security measures are imperative.
This is where Immuta comes into play. Founded in 2015, Immuta is a leading data security platform provider in North America, specializing in AI workflow protection. Its cloud-native data governance tools enable enhanced security, easier identification of sensitive data, and stronger access controls.
The June 2023 update brings the capabilities of Immuta’s platform to Databricks’ customers.
About the Immuta and Databricks Integration
Immuta and Databricks strengthened their partnership in May of this year when Databricks Ventures, the VC arm of the lakehouse company, made a significant investment in Immuta. While the exact amount remains undisclosed, Immuta stated that the funding would be used for product development.
The close collaboration between the two companies dates back to the spring of 2018, when Immuta introduced new tools for Apache Spark SQL. Since then, the partnership has evolved, culminating in this latest upgrade. Key features include access control for AI workload protection and localized discovery of sensitive data.
The Need for AI Workload Protection
In 2023, AI workload protection has become a critical necessity. Just as the cloud triggered a digital revolution, artificial intelligence is now seeing a similar surge in adoption. More than 90% of developers use AI tools at work, and approximately 77% of companies employ multiple third-party tools for AI workloads.
This makes systems like Databricks vulnerable to security attacks and breaches. Databricks is making a significant AI play, intending to help customers reduce costs and expedite innovation by unifying their data, analytics, and AI on one cloud platform. However, without adequate AI workload governance, this could create a significant threat vector.
As a result, the company has recently ramped up its cybersecurity efforts. In addition to Immuta’s crucial upgrades to its data security platform, Databricks has also partnered with security companies BigID, Theom, and Hunters this year.
Immuta’s Enhanced Data Security Platform
The recent update from Immuta introduces several key changes:
- Native Integration with Databricks Unity Catalog: The Databricks Unity Catalog organizes structured and unstructured data as well as machine learning models through a three-level namespace. Immuta’s native integration facilitates secure data access, detection of access issues, and discovery of sensitive information.
- Security Policy Enforcement: Immuta enables smarter security policies and streamlines their enforcement. Access rights at the Unity Table level can be granted or revoked through native policy enforcement.
- Centralized Metadata Management: Immuta centralizes metadata management for information stored in Databricks lakehouses. This allows for policy automation and orchestration and simplifies the discovery of sensitive data.
- User Activity Monitoring: Immuta enables monitoring and recording of all user activity on Databricks, including audit logs of user queries and access changes. The Unified Audit Model (UAM) ensures consistent log structure and metadata across Unity Catalog and the main Immuta instance, simplifying security measures.
- Access Control Policies: Immuta introduces automation to the Databricks environment by automatically creating multiple policies based on a single high-level intent, significantly reducing the manual effort of defining access control policies. The company claims that these new integrations can reduce the number of roles and policies to manage by a factor of 93.
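To make the idea of a single high-level intent producing many concrete policies more tangible, here is a minimal Python sketch. It is not Immuta's or Databricks' API: the `TablePolicy` class, the `expand_intent` function, the `CATALOG_TAGS` metadata, and the `pii_cleared` attribute are all hypothetical names chosen for illustration. The only detail taken from the article is that Unity Catalog addresses objects through a three-level namespace, assumed here to be written as `catalog.schema.table`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TablePolicy:
    """One concrete, per-column policy (hypothetical structure)."""
    table: str             # three-level name: catalog.schema.table
    column: str
    action: str            # e.g. "mask"
    unless_attribute: str  # users holding this attribute are exempt


# Hypothetical centralized metadata: table -> {column: set of tags}.
CATALOG_TAGS = {
    "main.sales.customers": {"email": {"pii"}, "region": set()},
    "main.sales.orders": {"card_number": {"pii", "pci"}, "total": set()},
    "main.hr.employees": {"ssn": {"pii"}, "title": set()},
}


def split_name(full_name):
    """Split a three-level name into (catalog, schema, table)."""
    catalog, schema, table = full_name.split(".")
    return catalog, schema, table


def expand_intent(tag, action, unless_attribute, catalog_tags):
    """Turn one intent ("apply `action` to every column tagged `tag`")
    into one concrete policy per matching column."""
    policies = []
    for table in sorted(catalog_tags):
        for column, tags in sorted(catalog_tags[table].items()):
            if tag in tags:
                policies.append(
                    TablePolicy(table, column, action, unless_attribute)
                )
    return policies


# One intent: mask anything tagged "pii" unless the user is cleared.
policies = expand_intent("pii", "mask", "pii_cleared", CATALOG_TAGS)
for policy in policies:
    print(policy)
```

In this sketch, tagging a new column as `pii` in the metadata is enough for the next expansion to cover it, which is the kind of manual effort the automation described above is meant to remove.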
Strengthening AI Workload Protection
While AI workload protection is not the sole purpose of the Immuta/Databricks integration, it is a primary use case. As artificial intelligence scales, there is a growing need to migrate AI data to the cloud. With the Databricks Unity Catalog now secured, tasks like filtering rows, masking columns, discovering sensitive data, and controlling AI data access can be carried out with greater confidence. Immuta also ensures compliance with regulations such as HIPAA and GDPR.
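The filtering and masking tasks mentioned above can be illustrated with a small, self-contained Python sketch. It only demonstrates the general technique on in-memory data and does not reflect how Immuta or Unity Catalog enforce policies internally. The function `visible_rows`, the `regions` and `pii_cleared` user fields, and the sample records are assumptions made for this example.

```python
MASK = "***"


def visible_rows(rows, user, region_column="region", masked_columns=("email",)):
    """Return only the rows the user may see, with sensitive columns masked.

    Hypothetical rules for this sketch:
      * a row is visible only if its region is in the user's regions;
      * masked columns are hidden unless the user is PII-cleared.
    """
    result = []
    for row in rows:
        if row[region_column] not in user["regions"]:
            continue  # row-level filter: drop rows outside the user's regions
        shown = dict(row)  # copy so the source data is never modified
        if not user["pii_cleared"]:
            for column in masked_columns:
                shown[column] = MASK  # column-level mask
        result.append(shown)
    return result


rows = [
    {"id": 1, "region": "EU", "email": "a@example.com"},
    {"id": 2, "region": "US", "email": "b@example.com"},
]
analyst = {"regions": {"EU"}, "pii_cleared": False}
steward = {"regions": {"EU", "US"}, "pii_cleared": True}

print(visible_rows(rows, analyst))
print(visible_rows(rows, steward))
```

Here the analyst sees a single EU row with the email masked, while the steward sees both rows unmasked. The same query yields different results depending on who runs it, which is what lets one table serve several audiences.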
The data security platform adds another layer of trust to security analysis, helping prioritize risks effectively and setting up real-time alerts for severe events that threaten AI workload governance.
Business Benefits
The newly announced solution brings several benefits to customers:
- Improved Performance Without Compromising Security: The Databricks Lakehouse architecture is compatible with any cloud, and now customers can combine this freedom with enhanced security from Immuta. This allows migration of AI data assets to any cloud environment while ensuring superior performance without compromising security.
- Streamlined Collaboration and Data Sharing on Databricks: Immuta enables secure collaboration on Databricks, maintaining detailed audit trails with full transparency. Relevant users can access the centralized metadata store, delegate stewardship of policies, and define purpose-based access.
- Gaining a Competitive Edge with Top-Notch Data Security: Immuta brings cutting-edge security and AI workload protection to the Databricks environment. Unity Catalog users can fully utilize their data repositories, build new AI models, and solve complex business problems without concerns about security or compliance.
Customer Testimonials
Immuta’s expertise in data governance and AI workload protection has been instrumental in accelerating HIPAA-compliant AI and machine learning for clients such as Cognoa in its clinical research.
“Databricks helps us manage that data, and Immuta plays an important role in administering security and access control. As we look to innovate with new products and implement a multi-cloud strategy, we must treat the data properly – it must be governed,” said Jack Berkowitz, Chief Data Officer at ADP.
“Swedbank needed to build an enterprise-scale advanced analytics platform that would also enforce trust in our security, management, and access to data internally while protecting our customers’ assets and data. Immuta and Databricks have been instrumental in helping us build that vision, and we are excited to see their partnership go to the next level,” said Vineeth Menon, Head of Data Lake Engineering at Swedbank.
Conclusion
In the AI era, cloud data infrastructure comprises three layers: the data lake, the data warehouse, and the data exchange. Immuta and Databricks introduce a fourth layer: data security in the cloud. Because artificial intelligence requires handling information at an incredible scale and speed, three aspects are critical:
- Separating the Policy from the Platform
- Native, Not Retrofitted Cloud Data Controls
- Leveraging Attributes Instead of Roles to Tag Data
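The last point, attributes instead of roles, can be sketched in a few lines of Python. The example is illustrative only: the dimensions (`departments`, `regions`, `sensitivities`), the `abac_allows` function, and the user and resource fields are hypothetical, and the numbers say nothing about the reduction Immuta reports. It simply contrasts one role per combination of dimensions with a single rule that compares user attributes to data tags.

```python
from itertools import product

departments = ["sales", "hr", "finance"]
regions = ["EU", "US", "APAC"]
sensitivities = ["public", "pii"]

# Role-based approach: one role per combination of dimensions.
roles = [f"{d}_{r}_{s}" for d, r, s in product(departments, regions, sensitivities)]
print(len(roles), "roles needed")


def abac_allows(user, resource):
    """Attribute-based approach: one rule evaluated against attributes."""
    return (
        resource["department"] == user["department"]
        and resource["region"] in user["regions"]
        and (resource["sensitivity"] != "pii" or user["pii_cleared"])
    )


user = {"department": "sales", "regions": {"EU", "US"}, "pii_cleared": False}
public_table = {"department": "sales", "region": "EU", "sensitivity": "public"}
pii_table = {"department": "sales", "region": "EU", "sensitivity": "pii"}

print(abac_allows(user, public_table))
print(abac_allows(user, pii_table))
```

Adding a fourth region grows the role list by six entries, whereas the attribute rule stays the same and only the user and data attributes change.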
The new integration, with its Discover, Detect, and Secure components, enables better granularity and manageability for data stores, benefiting existing customers.