Innovation Africa: Creating the Future Today
  • Big data, interactive access: How Apache Drill makes it easy

    July 30, 2015 Editor


    Register for the free webcast “Easy, real-time access to data with Apache Drill,” which will be held Thursday, July 30, 2015, at 10 a.m. PT. This panel discussion will explore the major role SQL-on-Hadoop technologies play in organizations.

    Big data techniques are becoming mainstream in an increasing number of businesses, but how do people get self-service, interactive access to their big data? And how do they do this without having to train their SQL-literate employees to be advanced developers?

    One solution is to take advantage of Apache Drill, a rapidly maturing open source, open community software tool. Drill is not the first SQL-on-Hadoop tool. It is, however, a new, sophisticated, and highly scalable SQL query engine that has been built from the ground up to be appropriate for use even in production settings. Drill extends query capabilities to a variety of new data sources and formats without the IT intervention that a SQL query engine might be expected to require. In short, Drill allows self-service exploration of data by providing flexibility along with performance.

    As capabilities in the big data world have progressed, our understanding of what is needed for high-performance, enterprise-grade architectures has also increased. A need for a SQL solution for the Hadoop and NoSQL space was recognized fairly early. To meet that urgent need, some of the first tools unsurprisingly approached the problem with SQL-like syntax and made compromises that limited the data sources and formats they could handle well.

    For those of you who are early Hadoop users, you’ve built up experience with SQL-like queries on data stored in Hadoop-based platforms, but most likely, you’ve also got a mental “wish list” of the things you’d like to see improved or added in a SQL-on-Hadoop query engine. The good news for you is that Apache Drill addresses many items likely to be on your wish list.

    And for those of you new to the big data world of Hadoop and NoSQL, your good news is that Drill just made it much easier to step into this space and take advantage of new data sources and cost-effective platforms for handling data at scale. How? Drill does this in part by letting you build on your years of experience with SQL and familiar BI tools. Drill is SQL, not “SQL-like.” But Drill is more than just another SQL query system: Drill also extends SQL and handles complex data cleanly without a requirement for a pre-defined schema.

    Having been built to meet the needs that were projected for a maturing big data arena, Drill is a tool that should prove valuable in current use cases.

    Look at these examples of what Drill can do:

    • Drill uses standard ANSI SQL syntax for queries to meet the need for familiar tools and approaches, especially for business analysts.
    • Drill views address the need for more granular security for big data.
    • Drill was designed for performance on data stored with the new columnar data formats such as Apache Parquet even when the data stored in this format uses complex structures (nested data).
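
    The first two points in the list above can be sketched in a few lines of Python. This is a minimal sketch, not an official client: it assumes a Drill instance running locally with its REST interface on the default port 8047, and the file path, column names, and view name are hypothetical. It builds the JSON body that Drill's REST query endpoint expects, then defines one ordinary SQL query over a Parquet file and one view that exposes only selected columns.

```python
import json
import urllib.request

# Assumption: a local Drill instance with the REST interface on its default port.
DRILL_URL = "http://localhost:8047/query.json"


def build_drill_request(sql):
    """Build the JSON body for Drill's REST query endpoint."""
    return {"queryType": "SQL", "query": sql}


def run_drill_query(sql, url=DRILL_URL):
    """POST a SQL statement to Drill and return the decoded JSON response."""
    body = json.dumps(build_drill_request(sql)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Standard SQL over a Parquet file; the path and columns are hypothetical.
SALES_BY_REGION = """
SELECT region, SUM(amount) AS total
FROM dfs.`/data/sales.parquet`
GROUP BY region
ORDER BY total DESC
"""

# A view that exposes only two columns of the same hypothetical file,
# so that users can be granted access to the view rather than the raw data.
RESTRICTED_VIEW = """
CREATE OR REPLACE VIEW dfs.tmp.sales_public AS
SELECT region, amount
FROM dfs.`/data/sales.parquet`
"""
```

    With a running Drill instance, `run_drill_query(SALES_BY_REGION)` would submit the query. Nothing in the sketch contacts the server until that function is called.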

    How useful is Drill’s performance with Parquet data? Very. Increasingly, people want to take advantage of Parquet’s compression and efficient structure. Netflix and Twitter are two well-known organizations using Parquet at large scale. Drill also handles nested data stored as JSON, another data format being widely used, in part thanks to widespread use of JavaScript in Web applications. Drill can even analyze the heavily nested data contained in Drill’s own performance logs (or even Impala’s logs). Other query tools, especially those based on SQL, have a very hard time dealing with such complex data.
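
    To make the nested-data point concrete, here is a small illustration in plain Python rather than in Drill itself. It mimics what a Drill query such as SELECT t.customer.address.city FROM dfs.`/data/orders.json` t resolves for each record. The records, the file path, and the `get_path` helper are all hypothetical, and returning None for a missing field is a simplification of how a schema-free engine treats absent values.

```python
import json

# Hypothetical input: one JSON document per line.
RAW = """
{"id": 1, "customer": {"name": "Ada", "address": {"city": "Lagos"}}}
{"id": 2, "customer": {"name": "Ben", "address": {"city": "Nairobi"}}}
"""


def get_path(record, dotted):
    """Follow a dotted path through nested dicts; return None if a step is missing."""
    value = record
    for key in dotted.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


records = [json.loads(line) for line in RAW.strip().splitlines()]
cities = [get_path(record, "customer.address.city") for record in records]
print(cities)  # ['Lagos', 'Nairobi']
```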

    This is an exciting time to get involved with Drill. Version 1.1 was released less than a month ago, with some excellent additions, including automatic partitioning of Parquet files, new windowing functions, FLATTEN support for very large complex objects, improved data access via a better JDBC driver, and improvements for the MongoDB plug-in.
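
    For readers unfamiliar with FLATTEN, the following plain-Python sketch illustrates the idea: each element of a repeated (array) field becomes its own output row, with the other fields copied alongside. It is an emulation for illustration only, not Drill code; the `flatten` helper and the sample data are hypothetical. In Drill, the comparable query would look something like SELECT order_id, FLATTEN(items) AS item FROM dfs.`/data/orders.json`, with a hypothetical path.

```python
def flatten(rows, column):
    """Return one output row per element of row[column], copying the other fields."""
    output = []
    for row in rows:
        # Rows whose array is missing or empty contribute no output rows.
        for element in row.get(column) or []:
            new_row = dict(row)
            new_row[column] = element
            output.append(new_row)
    return output


# Hypothetical nested records: each order holds an array of items.
orders = [
    {"order_id": 1, "items": ["pen", "ink"]},
    {"order_id": 2, "items": ["paper"]},
    {"order_id": 3, "items": []},
]

flat = flatten(orders, "items")
for row in flat:
    print(row)
```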

    Year of the Drill user

    One of the most exciting aspects of the Apache Drill project is the new round of innovation underway as the community of users grows. Users are putting Drill to work as it was originally designed to be used: to help them make better use of cost-effective, scalable Hadoop and NoSQL technologies, to get better time-to-value in their projects, and to discover new insights from their data.

    Users are also beginning to use Drill in ways we haven’t predicted; Drill users are now the innovators.

    To explore what Drill can do and how you might use it, join a Drill discussion in the free O’Reilly Community webcast sponsored by MapR Technologies on Thursday, July 30, 2015, at 10:00 a.m. PT. You’ll hear a conversation that includes the viewpoints of someone who helped build Drill and someone who uses it.

    The panel includes:

    • A leading Drill architect and Apache Drill PMC chair, Jacques Nadeau
    • An early end-user of Drill and Chief Architect for Data and Information Management at Cisco, Piyush Bhargava
    • A big data expert and Research Director at 451 Research, Matt Aslett
    • A Hadoop expert and VP of Marketing at MapR, Steve Wooledge

    Register for the webcast here.

    This post is a collaboration between O’Reilly and MapR. See our statement of editorial independence.

    Cropped public domain image via the British Library on Flickr.



    Categories: Technology

    Tags: big data

