How Amazon Kinesis can help you leverage real-time data

This contributed piece has been edited and approved by Network World editors

The more real-time data about customers, processes or competitors you can capture and analyze, the more quickly you’ll be able to react to important events. Amazon Kinesis is a cloud tool we really like because it lets you put your business’s real-time data to work without worrying about having enough storage and server capacity to process it all.

Amazon Kinesis is a suite of tools from Amazon Web Services that makes it easy for companies to capture, process and analyze real-time streaming data. Kinesis has three components:

Network World Cloud Computing

With Amazon Kinesis Analytics, devs can analyze real-time data with SQL

Amazon launched a new tool on Thursday aimed at helping developers build applications that offer insights from a firehose of data in real time. Kinesis Analytics will let users set up SQL queries that run on data that’s constantly updating, expanding the reach of the popular data analysis language beyond traditional database applications. 

Once a user has set up a Kinesis Analytics stream, the results can then be routed to up to four different services, including Amazon S3, Redshift, and Elasticsearch Service.

It’s a service that’s useful for bringing in data from sources that are rapidly shifting in real time, like sensor information from the internet of things, or live data from a stock market. That’s key as more and more companies start leaning on big sets of live data to help drive business applications. 
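
As a rough illustration of what a SQL query over constantly updating data computes, here is a minimal Python sketch. It is not Kinesis Analytics code and does not use any AWS API. The record shape `(timestamp, ticker, price)`, the function name `tumbling_avg` and the 10-second window are all assumptions made for this example. It mimics a query along the lines of "average price per ticker, per fixed time window" on a live stock feed.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical record shape for this sketch: (event_time_seconds, ticker, price)
Record = Tuple[int, str, float]


def tumbling_avg(
    records: Iterable[Record], window_seconds: int
) -> Iterator[Tuple[int, Dict[str, float]]]:
    """Yield (window_start, {ticker: average_price}) each time a window closes.

    Assumes records arrive in event-time order, which a real stream
    does not guarantee.
    """
    current_window = None
    sums: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)

    for ts, ticker, price in records:
        # Align the timestamp to the start of its fixed-size window.
        window = ts - ts % window_seconds
        if current_window is not None and window != current_window:
            # The previous window is complete: emit its per-ticker averages.
            yield current_window, {k: sums[k] / counts[k] for k in sums}
            sums.clear()
            counts.clear()
        current_window = window
        sums[ticker] += price
        counts[ticker] += 1

    # Emit the last, possibly partial, window when the input ends.
    if current_window is not None:
        yield current_window, {k: sums[k] / counts[k] for k in sums}


if __name__ == "__main__":
    feed = [(0, "AMZN", 10.0), (5, "AMZN", 20.0), (12, "AMZN", 30.0)]
    for window_start, averages in tumbling_avg(feed, window_seconds=10):
        print(window_start, averages)
```

Because the function is a generator, results come out as each window closes rather than after all the data has been read, which is the main difference from running the same aggregation against a traditional database table.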

Computerworld Cloud Computing

FICO Predictive Suite Updated, Real-Time Rises: Big Data Roundup

Real-time open source data projects gain momentum, FICO updates its predictive analytics suite, Samsung introduces new devices for IoT, and a new contest for creating tools to analyze satellite data is launched in this Big Data Roundup for the week ending May 1, 2016.

InformationWeek: Cloud

AI, Public Data Sets, Real-Time: Strata + Hadoop Keynote Sampling

Strata + Hadoop keynotes included updates on the state of AI, new public data sets and programs from the US Department of Commerce, a closer look at what real-time data means for big data, and more. Here’s a sampling of some of our favorite keynotes from this week’s event.

InformationWeek: Cloud

MapR Enhances its Real-Time Processing Capabilities for Big Data Analysis

The big data platform vendor MapR has just introduced version 5.0 of its Hadoop distribution, based on version 2.7 of the open source framework for processing very large volumes of data, with support for Docker containers. MapR 5.0 also relies on the YARN resource manager.

This version strengthens the platform's real-time operational capabilities. In particular, it extends the highly reliable data transport framework used by MapR-DB Table Replication (which allows replication between multiple data centers) to deliver data to external engines and keep them synchronized in real time.

Compared to other Hadoop distributions, MapR extends the framework's functionality in security (data protection, user authentication, disaster recovery) as well as in high availability and performance. Version 5.0 brings further improvements in governance, with full auditing of data access through JSON, and support for Apache Drill views for secure access to the data being analyzed.

More and more companies deploy multiple applications on the same Hadoop cluster. In this context, the latest MapR release automates the synchronization of storage, databases and search indexes.

To facilitate the deployment of Hadoop clusters, the vendor has also included new self-provisioning templates that set up a cluster as if it were an appliance, without requiring specific hardware. These templates can be deployed using the MapR installer. The possible configurations include data lake services, data mining (interactive SQL with Apache Drill) and analysis of operational data (basic and MapR-DB NoSQL).

The Apache project will help with analysis and with running batch processes and their pipelines using fast, large-scale computation. The announced distribution automatically synchronizes storage, databases and search indexes to support complex real-time applications. It also has new auditing capabilities.

MapR Technologies intends to continue its growth in the big data and analytics segment. In this context, the MapR database can now use table replication to synchronize data in real time and make it available to external compute engines. The first supported use case is Elasticsearch, the Lucene-based search platform, for which full-text search indexes are synchronized automatically.

Last year, MapR and Apache Spark integrated their technologies to offer users around-the-clock support for Spark, to develop the solution and related projects at a faster rate and to integrate more innovative changes. In addition, the two are working together on rapid development of the software and of complementary new features. This should pay off for MapR customers and the Hadoop community over the coming years.

Recently, Oracle released a new software product designed to help meet big data demands. This product, called Oracle Big Data Spatial and Graph, provides new analytical capabilities for Hadoop and NoSQL. Oracle built the product to process data natively on Hadoop and in parallel on MapReduce, using in-memory structures.


CloudTimes