Oracle Autonomous Data Warehouse
This architecture uses Oracle Autonomous Data Warehouse on shared infrastructure.
- Enable auto scaling to give the database workloads up to three times the processing power.
- Consider using Oracle Autonomous Data Warehouse on dedicated infrastructure if you want the self-service database capability within a private database cloud environment running on the public cloud.
- Consider using the hybrid partitioned tables feature of Autonomous Data Warehouse to move partitions of data to Oracle Cloud Infrastructure Object Storage and serve them to users and applications transparently. We recommend this feature for data that is accessed infrequently and that doesn't require the same performance as data stored within Autonomous Data Warehouse.
- Consider using the external tables feature to consume data stored in Oracle Cloud Infrastructure Object Storage in real time without replicating it to Autonomous Data Warehouse. This feature transparently and seamlessly joins data sets curated outside of Autonomous Data Warehouse, regardless of format (Parquet, Avro, ORC, JSON, CSV, and so on), with data residing in Autonomous Data Warehouse (a minimal external table sketch follows this list).
- Consider using Autonomous Data Lake Accelerator when querying object storage data to deliver an improved, faster experience to users consuming and joining data between the data warehouse and the data lake.
- Consider using Analytic Views to semantically model the underlying star or snowflake schema of the data warehouse directly in ADW. Granular data is automatically aggregated without the need to preaggregate it, and the semantic model is consumed through SQL consistently from any SQL-compliant client, including Oracle Analytics Cloud, ensuring that facts and KPIs are served consistently regardless of the client. All data can participate in the semantic model whether it is stored in ADW or in Object Storage, making this feature a natural semantic modeling layer for a lakehouse architecture in which facts and dimensions can traverse both the data warehouse and the lake.
- Consider using customer-managed keys with the Vault service if full control of ADW encryption keys is required by company or regulatory policies.
- Consider using Database Vault in ADW to prevent unauthorized privileged users from accessing sensitive data and thus prevent data exfiltration and data breaches.
- Consider using Autonomous Data Guard to support a business continuity plan by setting up and maintaining a replicated standby instance, either in the same region or in another region.
- Consider using dynamic data masking with Data Redaction to serve masked data to users depending on their role, thereby ensuring appropriate data access without data duplication or static masking (a redaction policy sketch follows this list).
- Consider organizing your lake across different sets of buckets leveraging a medallion architecture (bronze, silver, gold) or other partitioning logic to segregate data based on its quality and enrichment, enforce fine-grained security for consumers reading the data, and apply different lifecycle management policies to the different tiers.
- Consider using different object storage tiers and lifecycle policies to optimize costs of storing lake data at scale.
- Consider using customer-managed keys with the Vault service if full control of Object Storage encryption keys is required by company or regulatory policies.
- Consider using Object Storage replication to support a business continuity plan by setting up bucket replication to another region. Because Object Storage is highly durable and maintains several copies of each object within a region, same-region bucket replication is not needed.
- Consider using AutoML in OCI Data Science or Oracle Machine Learning to speed up ML model development.
- Consider using Open Neural Network Exchange (ONNX) for interoperability. Third-party ONNX models can be deployed either to Oracle Machine Learning and exposed as a REST endpoint, or to OCI Data Science and exposed as an HTTP endpoint (an ONNX sketch follows this list).
- Consider saving the model in OCI Data Science as ONNX and importing it into OCI GoldenGate Stream Analytics if you need to run scoring and prediction in a real-time data pipeline, producing more timely predictions that can drive real-time business outcomes.
- Consider using OCI Data Science Conda environments for better management and packaging of Python dependencies inside Jupyter notebook sessions. Leverage the Anaconda curated repository of packages within OCI Data Science to use your favorite open-source tools to build, train, and deploy models.
- Consider using OCI Data Flow within the Data Science Jupyter environment to perform exploratory data analysis, data profiling, and data preparation at scale by leveraging Spark's scale-out processing.
- Consider using Data Labeling to label data such as images, text, or documents and use it to train ML models built with OCI Data Science or OCI AI Services, thereby improving the accuracy of predictions.
- Consider deploying an API Gateway to secure and govern the consumption of the deployed model if real-time predictions are consumed by partners and external entities (a signed-request sketch follows this list).
- Leverage Oracle Cloud Infrastructure Data Integration to coordinate and schedule Oracle Cloud Infrastructure Data Flow application runs, and to mix and match declarative ETL with custom Spark code logic. Use Functions from within Oracle Cloud Infrastructure Data Integration to further extend the capabilities of data pipelines.
- Consider using SQL pushdown for transformations that target ADW, following an ELT approach that is more efficient, performant, and secure than ETL.
- Consider allowing OCI Data Integration to handle schema drift in data sources so that data pipelines are more resilient and future-proof, and can withstand changes to source schemas.
- Consider using Oracle Cloud Infrastructure Data Catalog as a Hive metastore for Oracle Cloud Infrastructure Data Flow to securely store and retrieve schema definitions for objects in unstructured and semi-structured data assets, such as those in Oracle Cloud Infrastructure Object Storage.
- Consider using Delta Lake on OCI Data Flow if ACID transactions and unification of streaming and batch processing are needed for lake data (a Spark sketch follows this list).
- Consider using autoscaling to automatically scale worker nodes horizontally or vertically, based on metrics or on a schedule, to continuously optimize costs against resource demand.
- Consider using the OCI HDFS connector for Object Storage to read and write data to and from Object Storage, providing a mechanism to produce and consume data shared with other OCI services without replicating or duplicating it.
- Consider using Delta Lake on OCI Big Data Service if ACID transactions and unification of streaming and batch processing are needed for lake data.
- For predictive maintenance and anomaly detection use cases, consider using the Oracle Cloud Infrastructure Anomaly Detection service, which helps identify anomalies in a multivariate data set by taking advantage of the interrelationships among signals.
- Consider using Data Labeling to label training data used to tune AI Services such as Vision, Document Understanding, and Language, and obtain more accurate predictions.
- Consider using Oracle Cloud Infrastructure Functions to add any run-time logic needed to support specific API processing that is outside the scope of the data processing and the access and interpretation layers.
- Consider using usage plans to manage subscriber access to APIs, monitor and manage API consumption, set up different access tiers for different consumers, and support data monetization by tracking usage metrics that can be provided to an external billing system.
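The following sketch illustrates the external tables recommendation above. It is a minimal example, not a definitive implementation: it assumes a wallet-based connection with python-oracledb, an existing Object Storage credential named OBJ_STORE_CRED, a hypothetical lake-gold bucket holding Parquet files, and a hypothetical PRODUCTS table in the warehouse. It uses the DBMS_CLOUD.CREATE_EXTERNAL_TABLE procedure to expose lake data to SQL without copying it into ADW.

```python
# Minimal sketch: create an ADW external table over Parquet files in Object Storage,
# then query it alongside warehouse-resident data. Credential, bucket, namespace,
# and column names are placeholders -- adapt them to your tenancy.
import oracledb

conn = oracledb.connect(
    user="admin",
    password="<password>",
    dsn="myadw_high",                     # TNS alias from the ADW wallet
    config_dir="/path/to/wallet",
    wallet_location="/path/to/wallet",
    wallet_password="<wallet-password>",
)

ddl = """
BEGIN
  DBMS_CLOUD.CREATE_EXTERNAL_TABLE(
    table_name      => 'SALES_EXT',
    credential_name => 'OBJ_STORE_CRED',  -- created earlier with DBMS_CLOUD.CREATE_CREDENTIAL
    file_uri_list   => 'https://objectstorage.<region>.oraclecloud.com/n/<namespace>/b/lake-gold/o/sales/*.parquet',
    format          => '{"type":"parquet"}',
    column_list     => 'PRODUCT_ID NUMBER, AMOUNT NUMBER, SOLD_AT TIMESTAMP'
  );
END;"""

with conn.cursor() as cur:
    cur.execute(ddl)
    # Join lake data with a warehouse table (PRODUCTS) as if both were local.
    cur.execute("""
        SELECT p.name, SUM(s.amount)
        FROM   sales_ext s JOIN products p ON p.product_id = s.product_id
        GROUP  BY p.name""")
    for row in cur:
        print(row)
```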
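The next sketch illustrates the dynamic data masking recommendation. It assumes the same wallet-based connection as the previous example; the schema, table, column, policy, and role names are placeholders. It uses the DBMS_REDACT.ADD_POLICY procedure so that sessions matching the policy expression see a fully redacted value instead of the real one.

```python
# Minimal sketch: add a Data Redaction policy in ADW so that the EMAIL column is
# fully redacted for sessions that have the (hypothetical) ANALYST_ROLE enabled.
import oracledb

conn = oracledb.connect(user="admin", password="<password>", dsn="myadw_high",
                        config_dir="/path/to/wallet",
                        wallet_location="/path/to/wallet",
                        wallet_password="<wallet-password>")

redact_policy = """
BEGIN
  DBMS_REDACT.ADD_POLICY(
    object_schema => 'SALES',
    object_name   => 'CUSTOMERS',
    column_name   => 'EMAIL',
    policy_name   => 'mask_customer_email',
    function_type => DBMS_REDACT.FULL,
    -- Redaction applies whenever this expression evaluates to TRUE for the session.
    expression    => q'[SYS_CONTEXT('SYS_SESSION_ROLES', 'ANALYST_ROLE') = 'TRUE']'
  );
END;"""

with conn.cursor() as cur:
    cur.execute(redact_policy)
```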
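The following sketch illustrates the ONNX interoperability recommendation. The toy model, feature count, and file name are illustrative; the point is that one ONNX artifact can be produced from an open-source framework and scored with a single runtime, whether it is later imported into Oracle Machine Learning or wrapped in an OCI Data Science model deployment.

```python
# Minimal sketch: export a scikit-learn model to ONNX and score it with ONNX Runtime.
import numpy as np
from sklearn.linear_model import LogisticRegression
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType
import onnxruntime as ort

# Train a toy model on synthetic data.
X = np.random.rand(200, 4).astype(np.float32)
y = (X[:, 0] > 0.5).astype(int)
model = LogisticRegression().fit(X, y)

# Convert the model to ONNX and persist it.
onnx_model = convert_sklearn(model, initial_types=[("input", FloatTensorType([None, 4]))])
with open("model.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())

# Score locally with ONNX Runtime -- the same artifact can be deployed to OML
# or to an OCI Data Science model deployment and exposed over HTTP.
session = ort.InferenceSession("model.onnx")
preds = session.run(None, {"input": X[:5]})[0]
print(preds)
```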
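The next sketch illustrates consuming a real-time prediction endpoint, as in the API Gateway recommendation. It assumes the endpoint requires OCI request signing and that a valid OCI CLI configuration exists at the default location; the gateway hostname, route, and payload are placeholders.

```python
# Minimal sketch: call a prediction endpoint (for example, a Data Science model
# deployment fronted by API Gateway) from Python using OCI request signing.
import requests
import oci

config = oci.config.from_file()              # reads ~/.oci/config by default
signer = oci.signer.Signer(
    tenancy=config["tenancy"],
    user=config["user"],
    fingerprint=config["fingerprint"],
    private_key_file_location=config["key_file"],
)

endpoint = "https://<gateway-hostname>/predict"   # hypothetical deployed route
payload = {"data": [[0.7, 0.1, 0.3, 0.9]]}

response = requests.post(endpoint, json=payload, auth=signer)
response.raise_for_status()
print(response.json())
```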
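The last sketch illustrates the Delta Lake and Object Storage recommendations for Spark workloads. It is a minimal OCI Data Flow-style application: bucket and namespace names are placeholders, and Delta support assumes a Spark runtime with the Delta Lake libraries available. It reads raw CSV data from a "bronze" bucket through the oci:// filesystem scheme, cleans it, and writes it as a Delta table to a "silver" bucket, following the medallion layout described above.

```python
# Minimal sketch: bronze-to-silver batch job reading and writing Object Storage.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("bronze-to-silver").getOrCreate()

# Read raw CSV files landed in the (hypothetical) bronze bucket.
raw = (spark.read
       .option("header", "true")
       .csv("oci://lake-bronze@mynamespace/sales/"))

# Basic cleansing: cast types and drop incomplete records.
cleaned = (raw
           .withColumn("amount", F.col("amount").cast("double"))
           .dropna(subset=["product_id", "amount"]))

# Persist the curated data as a Delta table in the silver bucket.
(cleaned.write
        .format("delta")
        .mode("overwrite")
        .save("oci://lake-silver@mynamespace/sales_delta/"))
```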
Considerations
When collecting, processing, and curating application data for analysis and machine learning, consider the following implementation options.
Oracle Cloud Infrastructure Data Integration provides a cloud native, serverless, fully-managed ETL platform that is scalable and cost efficient.
Oracle Cloud Infrastructure GoldenGate provides a cloud native, serverless, fully-managed, non-intrusive data replication platform that is scalable, cost efficient and can be deployed in hybrid environments.
Oracle Autonomous Data Warehouse is an easy-to-use, fully autonomous database that scales elastically, delivers fast query performance, and requires no database administration. It also offers direct access to data in object storage through external tables or hybrid partitioned tables.
Oracle Cloud Infrastructure Object Storage stores unlimited data in raw format.
Oracle Cloud Infrastructure Data Flow provides a serverless Spark environment to process data at scale with a pay-per-use, extremely elastic model.
Oracle Cloud Infrastructure Big Data Service provides enterprise-grade Hadoop-as-a-service with end-to-end security, high performance, and ease of management and upgradeability.
Oracle Analytics Cloud is fully managed and tightly integrated with the curated data in Oracle Autonomous Data Warehouse.
Data Science is a fully managed, self-service platform for data science teams to build, train, and manage machine learning (ML) models in Oracle Cloud Infrastructure. The Data Science service provides infrastructure and data science tools, such as AutoML and model deployment capabilities.
Oracle Machine Learning is a fully managed, self-service platform for data science available with Oracle Autonomous Data Warehouse that leverages the processing power of the warehouse to build, train, test, and deploy ML models at scale without the need to move the data outside of the warehouse.
Oracle Cloud Infrastructure AI services are a set of services that provide prebuilt models trained to perform tasks such as identifying potential anomalies or detecting sentiment.
Deploy
The Terraform code for this reference architecture is available in GitHub. You can pull the code into Oracle Cloud Infrastructure Resource Manager with a single click, create the stack, and deploy it. Alternatively, you can download the code from GitHub to your computer, customize the code, and deploy the architecture by using the Terraform CLI.
- Deploy by using Oracle Cloud Infrastructure Resource Manager:
- Click Deploy to Oracle Cloud. If you aren't already signed in, enter the tenancy and user credentials.
- Review and accept the terms and conditions.
- Select the region where you want to deploy the stack.
- Follow the on-screen prompts and instructions to create the stack.
- After creating the stack, click Terraform Actions, and select Plan.
- Wait for the job to be completed, and review the plan. To make any changes, return to the Stack Details page, click Edit Stack, and make the required changes. Then, run the Plan action again.
- If no further changes are necessary, return to the Stack Details page, click Terraform Actions, and select Apply.
- Deploy by using the Terraform CLI:
- Go to GitHub.
- Clone or download the repository to your local computer.
- Follow the instructions in the README document.
Explore More
Learn more about the features of this architecture and about related architectures.
- Oracle Data Platform
- Learn about designing data lakes in Oracle Cloud
- Secure your workloads using Oracle Cloud Infrastructure Network Firewall Service
- Deploy a secure landing zone that meets the CIS Foundations Benchmark for Oracle Cloud
- Best practices framework for Oracle Cloud Infrastructure
- Oracle Cloud Infrastructure documentation
- Oracle Cloud Cost Estimator