Aravind Mohan

Helpful Resources for Big Data and Cloud Computing:

  • Amazon EC2 Cloud Setup: (URL)
  • Apache Hadoop Cluster Setup: (URL)
  • Apache Hadoop MapReduce Java Programming Tutorial: (URL)
  • MongoDB Cluster Setup: (URL)
  • MongoDB MapReduce Java Programming Tutorial: (URL)
  • Cassandra Cluster Setup: (URL)
  • DATAVIEW Workflow Management System: (URL)

Big Datasets:

  • GroupLens dataset: (URL)
  • OpenXC Platform: (URL)
  • OpenXC Vehicle Trace Files: (URL)
  • U.S. patent data: (URL)
  • Public data sets on AWS (Amazon): (URL)
  • The Lemur Project ClueWeb09 dataset (1B web pages): (URL)
  • U.S. Census genealogy data: (URL)
  • Large health data sets (ehdp.com): (URL)
  • Quora's list of large, publicly available datasets: (URL)
  • BigFastBlog's list of large datasets: (URL)