by Jim Jagielski and Sally Khudairi
As the world’s largest and one of the most influential Open Source foundations, The Apache Software Foundation (ASF) is home to more than 350 community-led projects and initiatives. The ASF’s 731 individual Members and more than 7,000 Committers are global, diverse, and often embody a spirit of collective humility. We’ve assembled a list of 20 ubiquitous and up-and-coming Apache projects to celebrate the ASF’s 20th Anniversary on 26 March 2019, applaud our all-volunteer community, and thank the billions of users who benefit from their Herculean efforts.
1. Apache HTTP Server
The most popular Open Source HTTP server on the planet shot to fame just 13 months after its inception in 1995, and remains popular today because it is a secure, efficient, and extensible server that provides HTTP services in line with the latest HTTP standards. Serving modern operating systems including UNIX, Microsoft Windows, and Mac OS X, the Apache HTTP Server played a key role in the initial growth of the World Wide Web; its rapid adoption over all other Web servers combined was also instrumental to the wide proliferation of eCommerce sites and solutions. The Apache HTTP Server project was the ASF’s flagship project at its launch, and its open, community-driven, merit-based development process, known as “The Apache Way”, served as the model that future Apache projects emulated.
2. Apache Incubator
The Apache Incubator is the ASF’s nexus for innovation, serving as the entry path for projects and codebases wishing to officially become part of the efforts at The Apache Software Foundation. All code donations from external organizations and existing external projects go through the incubation process to ensure that all donations are in accordance with the ASF legal standards, and to develop diverse communities that adhere to the ASF’s guiding principles. Incubation is required of newly accepted projects until their infrastructure, communications, and decision-making process have stabilized in a manner consistent with other successful ASF projects. Whilst incubation neither reflects the completeness or stability of the code nor indicates that the project has yet to be fully endorsed by the ASF, its rigorous process of mentoring projects and their communities according to “The Apache Way” has led to the graduation of nearly 200 projects in the Incubator’s 16-year history. Today 51 “podlings” are undergoing development in the Apache Incubator across an array of categories, including annotation, artificial intelligence, Big Data, cryptography, data science/storage/visualization, development environments, Edge and IoT, email, JavaEE, libraries, machine learning, serverless computing, and more.
3. Apache Kafka
The Apache footprint as the foundation of the Big Data ecosystem continues to grow, from Accumulo to Hadoop to ZooKeeper, with fifty active projects to date and two dozen more in the Apache Incubator. Apache Kafka’s high-performance, distributed, fault-tolerant, real-time publish-subscribe messaging platform powers Big Data solutions at Airbnb, LinkedIn, MailChimp, Netflix, The New York Times, Oracle, PayPal, Pinterest, Spotify, Twitter, Uber, Wikimedia Foundation, and countless other businesses.
4. Apache Maven
Spinning out of the Apache Turbine servlet framework project in 2004, Apache Maven has risen to the top as the hugely popular build automation tool that helps Java developers build and release software. Stable, flexible, and feature-rich, Maven streamlines continuous builds, integration, testing, and delivery processes with an impressive central repository and robust plug-in ecosystem, making it the go-to choice for developers who want to easily manage a project’s build, reporting, and documentation.
5. Apache CloudStack
Apache CloudStack is super-quick to deploy, well-documented, and has an easy production environment; one of its biggest draws is that it “just works”. Powering some of the industry’s most visible Clouds –from global hosting providers to telcos to the Fortune 100 top 5% and more– the CloudStack community is cohesive, agile, and focused, leveraging 11 years of Cloud success to enable users to rapidly and affordably build fully featured clouds.
6. Apache cTAKES
Developed from real-world use at the Mayo Clinic in 2006, cTAKES was created by a team of physicians, computer scientists and software engineers seeking a natural language processing system for extraction of information from electronic medical record clinical free-text. Today Apache cTAKES is an integral part of the Mayo Clinic’s electronic medical records and has processed more than 80 million clinical notes. Apache cTAKES is a growing standard for clinical data management infrastructure across hospitals and academic institutions that include Boston Children’s Hospital, Cincinnati Children’s Hospital, Massachusetts Institute of Technology, University of Colorado Boulder, University of Pittsburgh, and University of California San Diego, as well as companies such as Wired Informatics.
7. Apache Ignite
Apache Ignite is used for transactional, analytical, and streaming workloads at petabyte scale for the likes of American Airlines, ING, Yahoo Japan and countless others on premises, on cloud platforms, or in hybrid environments. Apache Ignite’s in-memory data fabric provides an in-memory data grid, compute grid, streaming, and acceleration solutions across the Apache Big Data ecosystem, including Apache Cassandra, Apache Hadoop, Apache Spark, and more.
8. Apache CouchDB
Thousands of organizations such as the BBC, GrubHub, and the Large Hadron Collider use Apache CouchDB for seamless data flow between every imaginable computing environment, from globally-distributed server clusters to mobile devices to Web browsers. Its Couch Replication Protocol allows you to store, retrieve, and replicate data safely on premises or on the Cloud with very high performance and reliability. Apache CouchDB does all the heavy lifting so you can sit back and relax.
9. Apache Edgent (incubating)
The boom of IoT –personal assistants, smart phones, smart homes, connected cars, Industry 4.0 and beyond– is producing an ever-growing amount of data streaming from millions of systems, sensors, equipment, vehicles and more. The demand for reliable, efficient real-time data has driven the need for the “Empowered Edge”, where data collection and analysis are optimized by moving away from centralized sources towards the edges of the networks, where much of the data originates. Companies like IBM and SAP are leveraging Apache Edgent to accelerate analytics at the edge across the IoT ecosystem. Apache Edgent can be used in conjunction with many Apache data analytics solutions such as Apache Flink, Apache Kafka, Apache Samza, Apache Spark, Apache Storm, and more.
10. Apache OFBiz
Whereas most of the ASF projects are about running or creating infrastructure, we also realize the importance of running and handling a business. Apache OFBiz is a comprehensive suite of business applications, from accounting and CRM through warehousing and inventory control. The Java-based framework provides the power and the flexibility to serve as the core of one’s B2B and B2C business management, and is easily expandable and customizable. Apache OFBiz is a complete ERP solution that is flexible, free, and fully Open Source, and serves users from United Airlines to Cabi.
11. Apache SIS (Spatial Information System)
The US National Oceanic and Atmospheric Administration, Vietnamese National Space Center, numerous spatial agencies, governments, and others rely on Apache SIS to create their own intelligent, standards-based interoperable geospatial applications. The Apache SIS toolkit handles spatial data, location awareness, and geospatial data representation, and provides a unified metadata model for file formats used for real-time smart city visualization, geospatial dataset discovery, state-of-the-art location-enabled emergency management, earth observation, as well as information modeling for extra-terrestrial bodies such as Mars and asteroids.
12. Apache Syncope
Apache Syncope manages digital identity data in enterprise applications and environments to handle user information such as username, password, first name, last name, email address, etc. Identity management involves considering user attributes, roles, resources and entitlements that control who has access to what data, when, how, and why. Apache Syncope users include the Italian Army, the University of Helsinki, University of Milan, and the SWITCH Swiss university network.
13. Apache PLC4X (incubating)
Connectivity and integration across many Industrial IoT edge gateways is often impossible with closed-source, proprietary legacy systems with incompatible protocols. Apache PLC4X provides a universal protocol adapter for creating Industrial IoT applications through a set of libraries that allow unified access to any type of industrial programmable logic controller (PLC) using a variety of protocols with a shared API. In addition, the project is planning modular integrations with Apache IoT projects that include Apache Brooklyn, Apache Camel, Apache Edgent, Apache Kafka, Apache Mynewt, and Apache NiFi.
14. Apache Commons
With 42%+ of Apache projects written in Java (that’s 62+ million lines of code), having a set of stable, reusable Open Source Java software components available to all Apache projects and external users is both helpful and necessary. Apache Commons provides a suite of dozens of stable, reusable, easily deployed Java components, and a workspace for Commons contributors to collaborate on the development of new components.
15. Apache Spark
Big Data is growing exponentially each year, accelerated by industries such as agriculture, big business, FinTech, healthcare, IoT, manufacturing, mobile advertising and more. Apache Spark’s unified analytics engine for large-scale data processing helps data scientists apply machine learning insights and an array of libraries to improve responsiveness and deliver more accurate results. Apache Spark runs workloads up to 100x faster, runs on Apache Hadoop, Apache Mesos, Kubernetes, standalone, or in the cloud, and can access diverse data sources, including Apache Cassandra, Apache Hadoop HDFS, Apache HBase, Apache Hive, and hundreds of others.
16. Apache Cordova
Apache Cordova is the popular developer tool used to easily build cross-platform, cross-device mobile apps using a Write-Once-Run-Anywhere solution, enabling developers to create a single app that will appear the same across multiple mobile device platforms. Apache Cordova acts as an extensible container, and serves as the base that most mobile application development tools and frameworks are built upon, including mobile development platforms and commercial software products by Blackberry, Google, IBM, Intel, Microsoft, Oracle, Salesforce, and many others.
17. Apache Tomcat
Starting off as the Apache JServ project, designed to allow for Java “servlets” to be run in a Web environment, Tomcat grew into a full-fledged, comprehensive Java Application server and was the de facto reference implementation for the Java specifications. Since 2005, Apache Tomcat has formed, and still forms, the foundation of numerous Java-based web infrastructures such as eBay, E*Trade, Walmart, and The Weather Channel.
18. Apache Lucene/Solr
Adobe, AOL, Apple, AT&T, Bank of America, Bloomberg, Cisco, Disney, E*Trade, Ford, The Guardian, Homeland Security, Instagram, MTV Networks, NASA Planetary Data System, Netflix, SourceForge, Verizon, Walmart, whitehouse.gov, Zappos, and countless others turn to Apache Lucene/Solr to quickly and reliably index and search multiple sites and enterprise data such as documents and email. Popular features include near real-time indexing, automated failover and recovery, rich document parsing and indexing, user-extensible caching, design for high-volume traffic, and much more.
19. Apache Wicket
The Apache Wicket component-based Web application framework is prized by many followers for its “Plain Old Java Object” (POJO) data model and a separation of markup and logic that is uncommon in most frameworks. Developers have been using Apache Wicket since 2004 to quickly create powerful, reusable components using object-oriented methodology with Java and HTML. Wicket powers thousands of applications and sites for governments, stores, universities, cities, banks, email providers, and more, including Apress, DHL, SAP, Vodafone, and Xbox.com.
20. Apache Daffodil (incubating)
Governments handle massive amounts of complex and legacy data across security boundaries every day. In order for such data to be consumed, it must be inspected for correctness and sanitized of malicious data. Whilst traditional inspection methods are often proprietary, incomplete, and poorly maintained, Apache Daffodil streamlines the process with an Open Source implementation of the Data Format Description Language (DFDL) specification that fully describes a wide array of complex and legacy file formats down to the bit level. Daffodil can parse data to XML or JSON to allow for validation, sanitization, and transformation, and also serialize or “unparse” back to the original file format, effectively mitigating a large variety of common vulnerabilities.
The Apache Software Foundation is a leader in community-driven open source software and continues to innovate with dozens of new projects and their communities. Apache projects are managing exabytes of data, executing teraflops of operations, and storing billions of objects in virtually every industry. Apache software is an integral part of nearly every end user computing device, from laptops to tablets to phones. The commercially-friendly and permissive Apache License v2.0 has become an open source industry standard. As the demand for quality open source software continues to grow, the collective Apache community will continue to rise to the challenge of solving current problems and ideating tomorrow’s opportunities through The Apache Way of open development. Learn more at http://apache.org/
# # #