Oracle DB connection and processing

Say I want to process data in an Oracle DB, transform it using some tables in a Hive DWH, and publish the result as a new Hive table. Will that be possible in Paxata?

Is connecting to an Oracle data warehouse seamless in Paxata?

How does the processing happen? Does Paxata just pull the data from Oracle directly into Spark every time the project is run?

Answers

  • @bstephens
    Thank you for a great explanation! Now I better appreciate the Data Library part. It looks like it may need to support massive data volumes. Is Azure Blob Storage a commonly used option for the Data Library?
  • MagmaMan Posts: 15
    edited January 11, 2019 7:19AM
    NOTE: The following comment is wrong; please ignore it. Direct Publish is when Spark talks directly with the Data Library. So data is ultimately always loaded from the Data Library, either directly or through Data Core servers (and not from the data sources, as I WRONGLY explained below). Apologies.

    EARLIER WRONG COMMENT:
    I just found that there is a Direct Data Load mode for Tenants. If that is enabled, the Spark workers load data directly from the source instead of using the Data Library! Otherwise, Data Core servers need to be installed; they bring in data from the Data Library, and the Spark workers then talk to the Data Core servers.