Hadoop is an open-source framework for the distributed processing of large data sets across clusters of commodity computers using a simple programming model, MapReduce.
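To make the "simple programming model" concrete, here is a minimal sketch of a map, shuffle, and reduce word count in plain Python. It runs in a single process and does not use any Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration. On a real cluster the framework would run many map and reduce tasks in parallel on different machines and handle the grouping step itself.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group values by key, standing in for the framework's shuffle step."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Combine all values collected for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence within one process."""
    mapped = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "big cluster"]))
```

The programmer writes only the map and reduce functions; splitting the input, scheduling the work across machines, and grouping by key are the parts a distributed framework takes over.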