Innovation Africa: Creating the Future Today

    November 2, 2014 Editor

    Big data’s big ideas

    Looking back at the evolution of our Strata events, and the data space in general, we marvel at the impressive data applications and tools now being employed by companies in many industries. Data is having an impact on business models and profitability. It’s hard to find a non-trivial application that doesn’t use data in a significant manner. Companies that use data and analytics to drive decision-making continue to outperform their peers.

    Up until recently, access to big data tools and techniques required significant expertise. But tools have improved and communities have formed to share best practices. We’re particularly excited about solutions that target new data sets and data types. In an era when the requisite data skill sets cut across traditional disciplines, companies have also started to emphasize the importance of processes, culture, and people.

    As we look into the future, here are the main topics that guide our current thinking about the data landscape.

    Note: This document represents our thinking as of October 2014. You can keep up with the latest analysis and developments in the data space through the O’Reilly Data newsletter.

    Cognitive augmentation

    The combination of big data, algorithms, and efficient user interfaces can be seen in consumer applications such as Waze or Google Now. Our interest in this topic stems from the many tools that democratize analytics and, in the process, empower domain experts and business analysts. In particular, novel visual interfaces are opening up new data sources and data types.

    Examples:

    • Narrative Science adds descriptive summaries to the output generated by business intelligence tools (dashboards, charts, and tables).
    • Palantir and Quid use a combination of visualization, search, and analytics that enables domain experts to discover patterns hidden in large data sets.
    • StitchFix provides product recommendations by combining proprietary algorithms and expert stylists.
    • “Moving dots” (e.g., tracking data from athletics) are being analyzed by companies that specialize in spatio-temporal pattern recognition. Startup Second Spectrum provides analytics to coaches and front offices for many professional basketball teams. In the near future, its technology and recommendations will be available in real time to coaching staffs during in-game situations.

    Intelligence matters: Artificial intelligence and algorithms

    Bring up the topic of algorithms, and a discussion on recent developments in artificial intelligence (AI) is sure to follow. AI is the subject of an ongoing series of posts on O’Reilly Radar. The “unreasonable effectiveness of data” notwithstanding, algorithms remain an important area of innovation. We’re excited about the broadening adoption of algorithms like deep learning, and topics like feature engineering, gradient boosting, and active learning. As intelligent systems become common, security and privacy become critical. We’re interested in efforts to make machine learning secure in adversarial environments.

    Related resources:

    • The “Intelligence Matters” series on O’Reilly Radar covers recent developments in artificial intelligence.
    • Streamlining Feature Engineering: O’Reilly Radar post on new tools that enable feature discovery.
    • Hardcore Data Science day at Strata + Hadoop World 2014 features deep learning and other algorithms, analytic techniques, and a fascinating machine-learning pipeline toolkit from UC Berkeley’s AMPLab.

    The convergence of cheap sensors, fast networks, and distributed computation

    The Internet of Things (IoT) will require systems that can process and unlock massive amounts of event data. These systems will draw from analytic platforms developed for monitoring IT operations. Beyond data management, we’re following recent developments in streaming analytics and the analysis of large numbers of time series.

    Related resources:

    • I ❤ Logs: Event Data, Stream Processing, and Data Integration: This is a new book from the co-creator of Apache Kafka.
    • Surfacing anomalies and patterns in Machine Data: O’Reilly Radar post on large-scale event data platforms that originate from the world of IT operations.
    • How Twitter monitors millions of time series: O’Reilly Radar post on a distributed, near-real-time system that simplifies the collection, storage, and mining of massive amounts of event data.
    • Data Analysis on Streams: A recent webcast on popular techniques in real-time analytics.

    Data (science) pipelines

    Analytic projects involve a series of steps that often require different tools. A growing number of companies and open source projects integrate a variety of analytic tools into coherent user interfaces and packages. Many of these integrated tools enable replication, collaboration, and deployment. This remains an active area, as specialized tools rush to broaden their coverage of analytic pipelines.

    Examples and related resources:

    • Reproducing Data Projects: O’Reilly Radar post on popular approaches for reproducing, managing, and deploying complex data projects.
    • Project Jupyter: A new initiative from the creators of IPython.
    • Databricks Workspace: An impressive notebook interface that pulls together components of the Spark ecosystem.
    • Data Wrangling gets a fresh look: O’Reilly Radar post on new tools for data preparation.
    • Data Analysis is just one component of the Data Science workflow: An overview of modern data pipelines.

    Evolving, maturing marketplace of big data components

    Many popular components in the big data ecosystem are open source. As such, many companies build their data infrastructure and products by assembling components like Spark, Kafka, Cassandra, and ElasticSearch, among others. Contrast that with a few years ago, when many of these components weren’t ready (or didn’t exist) and companies built similar technologies from scratch. But companies are interested in applications and analytic platforms, not individual components. To that end, demand is high for data engineers and architects who are skilled in assembling these components and in maintaining robust data flows and data storage.

    Examples and related resources:

    • Some popular Apache projects: Hadoop, Spark, Cassandra, Kafka, Mesos, ZooKeeper.
    • Big Data systems are making a difference in the fight against cancer: O’Reilly Radar post provides an example of how open source distributed computing tools can make a profound impact in the health care domain.
    • Verticalized big data solutions: O’Reilly Radar post on domain-specific big data applications.
    • Hadoop Application Architectures: A book on best practices for building data management solutions.
    • Designing Data-intensive Applications: A book that looks at how to build applications using some popular big data components.

    Data scientists, design, and social science

    To be clear, data analysts have always drawn from social science (e.g., surveys, psychometrics) and design. We are, however, noticing that many more data scientists are expanding their collaborations with product designers and social scientists.

    Examples and related resources:

    • IDEO’s Hybrid Insights group integrates quantitative techniques with the qualitative methods popular among product designers.
    • Datascope Analytics: A Chicago-based data science consulting group that incorporates techniques from product design.
    • Ideation (idea generation) workshops are beginning to be used by some data scientists.
    • Thinking with Data: This book by Max Shron provides an overview of ideas and techniques from the social sciences.

    Building a data culture

    “Data-driven” organizations excel at using data to improve decision-making. It all starts with instrumentation. “If you can’t measure it, you can’t fix it,” says DJ Patil, VP of product at RelateIQ. In addition, developments in distributed computing over the past decade have given rise to a group of (mostly technology) companies that excel in building data products. In many instances, data products evolve in stages (starting with a “minimum viable product”) and are built by cross-functional teams that embrace alternative analysis techniques.

    Related resources:

    • Building Data Science Teams: Data scientists are at the forefront of innovation in many data-driven organizations. This report offers practical advice for constructing teams that can drive that innovation.
    • Just Enough Math is a video series that introduces mathematical concepts using business cases.
    • Lean Analytics: Acquire a data-driven mindset through 30 case studies.
    • Data Jujitsu: A primer on organizing teams and building data products.

    Perils of big data

    Every few months, there seems to be an article criticizing the hype surrounding big data. Dig deeper and you find that many of the criticisms point to poor analysis and highlight issues known to experienced data analysts. Our perspective is that issues such as privacy and the cultural impact of models are much more significant.

    Examples and related resources:

    • On Being a Data Skeptic: A nuanced view of big data and data science.
    • Organizations like Code for America, Bayes Impact, Datakind, and Data & Society broaden the discussion of what data scientists can be working on and thinking about.
    • NIPS 2014 Workshop: Fairness, Accountability, and Transparency in Machine Learning: Researchers address “… growing anxieties about the role that machine learning plays in consequential decision-making in such areas as commerce, employment, health care, education, and policing.”
    • No silver bullet: De-identification still doesn’t work: Princeton security and privacy researchers survey anonymization strategies for a variety of data types.

    We’ll also explore each of these topics through our publishing program, events, webcasts, and online coverage. These explorations work best when they’re two-way roads, so please share your feedback through Twitter (@bigdata) or in the comments below.

    Categories: Technology

    Tags: analytics, artificial intelligence, Business intelligence, Data analysis, Data management