AI, Public Data Sets, Real-Time: Strata + Hadoop Keynote Sampling

Strata + Hadoop keynotes included updates on the state of AI, new public data sets and programs from the US Department of Commerce, a closer look at what real-time data means for big data, and more. Here’s a sampling of some of our favorite keynotes from this week’s event.
InformationWeek: Cloud

With Cloud Dataproc, Google promises a Hadoop or Spark cluster in 90 seconds

Getting insights out of big data is typically neither quick nor easy, but Google is aiming to change all that with a new, managed service for Hadoop and Spark.

Cloud Dataproc, which the search giant launched into open beta on Wednesday, is a new piece of its big data portfolio that’s designed to help companies create clusters quickly, manage them easily and turn them off when they’re not needed.

Enterprises often struggle with getting the most out of rapidly evolving big data technology, said Holger Mueller, a vice president and principal analyst with Constellation Research.

“It’s often not easy for the average enterprise to install and operate,” he said. When two open source products need to be combined, “things can get even more complex.”


Computerworld Cloud Computing

SAP brings Hadoop into the Hana fold with Vora, a new in-memory analytics tool

A new tool from SAP will allow companies to analyze distributed Hadoop data alongside corporate data using the ERP giant’s Hana in-memory computing platform.

Announced on Tuesday, SAP Hana Vora is an in-memory query engine that taps the Apache Spark execution framework to deliver interactive analytics on Hadoop.

By extending Hana’s reach to include distributed data in the Hadoop ecosystem, the tool is designed to help data scientists and developers combine corporate and external data in their analyses. That, in turn, means that incoming data from customers, partners and smart devices can be integrated with that from internal enterprise processes, giving companies better context with which to make decisions, SAP said.


CIO Cloud Computing

Oracle's Hadoop-Based Analytical Tools Explore Spatial Big Data Processing

The Apache Hadoop elephant is increasingly embraced by thousands of developers and companies around the world. As big data and the demand for real-time analytics grow globally, Hadoop's emergence has opened new oceans of data to explore.

Now, Oracle has a new software product designed to help meet big data demands. The product, called Oracle Big Data Spatial and Graph, provides new analytical capabilities for Hadoop and NoSQL.

Users of the Oracle database have long had access to graph and spatial analytics tools, which are used to discover relationships and to analyze data sets involving location. To accommodate diverse data sets and minimize the need for data movement, Oracle built the product to process data natively on Hadoop, running in parallel on MapReduce and using in-memory structures.

There are two main components. One is a distributed property graph with more than 35 high-performance analytic functions that run in parallel and in memory. The other is a collection of spatial analysis functions and services for evaluating data based on how near or far something is, whether it falls within a boundary or region, and for processing and displaying geospatial data and imagery. Analysts can then discover relationships and connections between customers, organizations and assets.

The property graph data management and analysis features ease work on big data, making it possible to develop models in real time thanks to parallel in-memory analytics. The graphs are flexible and easy to evolve: metadata is stored as part of the graph, and new findings can be added on the fly. With the spatial tools, users can take data that carries location information, enrich it, and use it to harmonize their whole environment.
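The kind of connection-finding the property graph component performs can be sketched in miniature. The node names, edge labels, and tiny graph below are invented for illustration; Oracle's product runs such analytics in parallel and in memory on Hadoop, whereas this is a plain single-machine breadth-first search.

```python
from collections import deque

# A toy property graph: nodes carry properties, edges are labeled.
nodes = {
    "cust:alice":   {"type": "customer"},
    "org:acme":     {"type": "organization"},
    "asset:truck1": {"type": "asset"},
}
edges = {
    "cust:alice":   [("buys_from", "org:acme")],
    "org:acme":     [("owns", "asset:truck1")],
    "asset:truck1": [],
}

def connected(start, goal):
    """Breadth-first search: is `goal` reachable from `start`?"""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for _label, neighbor in edges.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return False

print(connected("cust:alice", "asset:truck1"))  # True: customer -> org -> asset
```

Reachability is only one of the simplest graph analytics; the product's function library reportedly covers dozens more, but the principle of traversing labeled relationships between customers, organizations and assets is the same.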

According to the Oracle post, “With the spatial capabilities, users can take data with any location information, enrich it, and use it to harmonize their data. For example, Big Data Spatial and Graph can look at datasets like Twitter feeds that include a zip code or street address, and add or update city, state, and country information. It can also filter or group results based on spatial relationships: for example, filtering customer data from logfiles based on how near one customer is to another, or finding how many customers are in each sales territory. These results can be visualized on a map with the included HTML5-based web mapping tool. Location can be used as a universal key across disparate data commonly found in Hadoop-based analytic solutions.”

The Big Data Discovery analytic tool is Oracle's framework for profiling, exploring, analyzing and finding correlations in data from a Hadoop system. Last month, Oracle extended its Data Integrator middleware, aimed at database and data-warehousing specialists, to cover activities associated with big data. Oracle Data Integrator for Big Data aims to help companies work with their data without learning Scala, Oozie or other ETL tooling, generating the transformations in those languages from simple mappings.