bigdata.andreamostosi.name: The Big-Data Ecosystem Table

bigdata.andreamostosi.name Profile

Main domain: andreamostosi.name

Title: The Big-Data Ecosystem Table

Description: This page is built by merging the Hadoop Ecosystem Table (by Javi Roman and other contributors) with the project list collected on my blog. The result is an incomplete-but-useful list of big-data related projects. If you like, you can contribute to the original project or to my fork.

bigdata.andreamostosi.name Information

Website / Domain: bigdata.andreamostosi.name
HomePage size: 296.019 KB
Page Load Time: 0.164913 Seconds
Website IP Address: 172.67.129.134
Isp Server: CloudFlare Inc.

bigdata.andreamostosi.name Ip Information

Ip Country: United States
City Name: San Francisco
Latitude: 37.775699615479
Longitude: -122.39520263672

bigdata.andreamostosi.name HTTP Header

Date: Sun, 14 Mar 2021 22:23:16 GMT
Content-Type: text/html
Transfer-Encoding: chunked
Connection: keep-alive
Set-Cookie: __cfduid=d1301d3a82c0d47bdc87c0712f9b90e1c1615760596; expires=Tue, 13-Apr-21 22:23:16 GMT; path=/; domain=.andreamostosi.name; HttpOnly; SameSite=Lax
x-amz-id-2: LN0IxixeAy+nJsmpIXmX4pbT8rBJaAd9+4XotHTtJscXofou1N9kOKLJTD0nWxuHXlgs8u7jukg=
x-amz-request-id: 937M8YCT0WKYPZQN
last-modified: Sun, 05 Jun 2016 20:32:58 GMT
CF-Cache-Status: DYNAMIC
cf-request-id: 08d46f35e300000cd962275000000001
Report-To: "group":"cf-nel","endpoints":["url":"https:\\/\\/a.nel.cloudflare.com\\/report?s=Of1UK1woBmZWOp7dKC5pGvApNWlIQw6jQvyHGvxwSJUx8tz%2Fyp250RvLVakjGMbL9ouZ1O6%2F1tgC7wqooj4uFJOduAZusXTQNMiG%2FD1bjB4h0WylPFpMCqbpugdCWATFtTI8yuwDmg%3D%3D"],"max_age":604800
NEL: "max_age":604800,"report_to":"cf-nel"
Server: cloudflare
CF-RAY: 6300e7cfdf890cd9-EWR
Content-Encoding: gzip
alt-svc: h3-27=":443"; ma=86400, h3-28=":443"; ma=86400, h3-29=":443"; ma=86400

bigdata.andreamostosi.name Meta Info

charset="utf-8"/
content="chrome=1" http-equiv="X-UA-Compatible"/
content="The Big-Data Ecosystem Table" property="og:title"/
content="http://bigdata.andreamostosi.name/bigdata.jpg" property="og:image"/
content="Useful Stuff - Andrea Mostosi Blog" property="og:site_name"/
content="This page is built merging the Hadoop Ecosystem Table (by Javi Roman and other contributors) and projects list collected on my blog. Result is an incomplete-but-useful list of big-data related projects. If you like you can contribute to the original project or to my fork." property="og:description"/
content="summary" name="twitter:card"/
content="http://bigdata.andreamostosi.name" name="twitter:url"/
content="The Big-Data Ecosystem Table" name="twitter:title"/
content="@zenkay" name="twitter:creator"/
content="This page is built merging the Hadoop Ecosystem Table (by Javi Roman and other contributors) and projects list collected on my blog. Result is an incomplete-but-useful list of big-data related projects. If you like you can contribute to the original project or to my fork." name="twitter:description"/
content="http://bigdata.andreamostosi.name/bigdata.jpg" name="twitter:image"/

bigdata.andreamostosi.name Similar Website

Domain: WebSite Title
bigdata.andreamostosi.name: The Big-Data Ecosystem Table
bigdata.teradata.com: Big Data | Big Data Analytics | Big Data Companies
bigdataanalytics.mit.edu: Data Science and Big Data Analytics: Making Data-Driven Decisions | MIT xPRO
qubole.com: Cloud Data Platform - Big Data Software | Qubole
discuss.analyticsvidhya.com: Data Science, Analytics and Big Data discussions
ttsa.teamapp.com: Table Tennis SA (Table Tennis South Australia Inc.) Home page - Table Tennis team/club based in Adel
bigdatadefence.iqpc.com: Big Data for Defence
bigdata.ieee.org: Home - IEEE Big Data
bcsurvey.wwbp.org: Small Studies Big Data
businessintelligence.ittoolbox.com: Topics Big Data Business Intelligence
learn.elephantscale.com: Elephant Scale - Training and Consulting in AI Big Data
posiq.net: PosIQ - Restaurant CRM and Big Data for Restaurants
tecnologiaqueinteressa.com: Tecnologia que Interessa! Ciência de Dados, IA & Big Data
brooke.bigcuties.com: HTTP://BROOKE.BIGCUTIES.COM - Big Cutie Brooke - Big Boobs Big Bottom Chunky & Plump BBW Big Cutie
spark.apache.org: Apache Spark™ - Unified Analytics Engine for Big Data

bigdata.andreamostosi.name HTML To Plain Text

Incomplete-but-useful list of big-data related projects packed into a JSON dataset.

Github repository: https://github.com/zenkay/bigdata-ecosystem
Raw JSON data: http://bigdata.andreamostosi.name/data.json
Original page on my blog: http://blog.andreamostosi.name/big-data/
by Andrea Mostosi ( http://blog.andreamostosi.name )

Frameworks

Apache Hadoop
Framework for distributed processing. Integrates MapReduce (parallel processing), YARN (job scheduling) and HDFS (distributed file system).
1. Apache Hadoop

Distributed Programming

AddThis Hydra
Hydra is a distributed data processing and storage system originally developed at AddThis. It ingests streams of data (think log files) and builds trees that are aggregates, summaries, or transformations of the data. These trees can be used by humans to explore (tiny queries), as part of a machine learning pipeline (big queries), or to support live consoles on websites (lots of queries).
1. Github

Akela
Mozilla’s utility library for Hadoop, HBase, Pig, etc.
1. Website

Amazon Lambda
A compute service that runs your code in response to events and automatically manages the compute resources for you.
1. Website

Amazon SPICE
Super-fast Parallel In-memory Calculation Engine.
1. Website

AMPcrowd
A RESTful web service that runs microtasks across multiple crowds.
1. Website

AMPLab G-OLA
A novel mini-batch execution model that generalizes OLA to support general OLAP queries with arbitrarily nested aggregates using efficient delta maintenance techniques.
1. Website

AMPLab SIMR
Apache Spark was developed with Apache YARN in mind. However, up to now, it has been relatively hard to run Apache Spark on Hadoop MapReduce v1 clusters, i.e. clusters that do not have YARN installed. Typically, users would have to get permission to install Spark/Scala on some subset of the machines, a process that could be time consuming. SIMR allows anyone with access to a Hadoop MapReduce v1 cluster to run Spark out of the box. A user can run Spark directly on top of Hadoop MapReduce v1 without any administrative rights, and without having Spark or Scala installed on any of the nodes.
1. SIMR on GitHub

Apache Crunch
A simple Java API for tasks like joining and data aggregation that are tedious to implement on plain MapReduce. The APIs are especially useful when processing data that does not fit naturally into a relational model, such as time series, serialized object formats like protocol buffers or Avro records, and HBase rows and columns. For Scala users, there is the Scrunch API, which is built on top of the Java APIs and includes a REPL (read-eval-print loop) for creating MapReduce pipelines.
1. Website

Apache DataFu
DataFu provides a collection of Hadoop MapReduce jobs and functions in higher level languages based on it to perform data analysis. It provides functions for common statistics tasks (e.g. quantiles, sampling), PageRank, stream sessionization, and set and bag operations. DataFu also provides Hadoop jobs for incremental data processing in MapReduce. DataFu is a collection of Pig UDFs (including PageRank, sessionization, set operations, sampling, and much more) that were originally developed at LinkedIn.
1. DataFu Apache Incubator
2. LinkedIn DataFu

Apache Flink
High-performance runtime, and automatic program optimization.
1. Website

Apache Gora
Framework for in-memory data model and persistence.
1. Apache Gora

Apache Hama
Apache Top-Level open source project, allowing you to do advanced analytics beyond MapReduce. Many data analysis techniques such as machine learning and graph algorithms require iterative computations; this is where the Bulk Synchronous Parallel model can be more effective than “plain” MapReduce.
1. Hama site

Apache Ignite
High-performance, integrated and distributed in-memory platform for computing and transacting on large-scale data sets in real-time.
1. Website

Apache MapReduce
MapReduce is a programming model for processing large data sets with a parallel, distributed algorithm on a cluster. Apache MapReduce was derived from Google's paper "MapReduce: Simplified Data Processing on Large Clusters". The current Apache MapReduce version is built over the Apache YARN framework. YARN stands for “Yet-Another-Resource-Negotiator”. It is a new framework that facilitates writing arbitrary distributed processing frameworks and applications. YARN’s execution model is more generic than the earlier MapReduce implementation. YARN can run applications that do not follow the MapReduce model, unlike the original Apache Hadoop MapReduce (also called MR1). Hadoop YARN is an attempt to take Apache Hadoop beyond MapReduce for data-processing.
1. Apache MapReduce
2. Google MapReduce paper
3. Writing YARN applications

Apache Pig
Pig provides an engine for executing data flows in parallel on Hadoop. It includes a language, Pig Latin, for expressing these data flows. Pig Latin includes operators for many of the traditional data operations (join, sort, filter, etc.), as well as the ability for users to develop their own functions for reading, processing, and writing data. Pig runs on Hadoop. It makes use of both the Hadoop Distributed File System, HDFS, and Hadoop’s processing system, MapReduce.
1. pig.apache.org/
2. Pig examples by Alan Gates

Apache S4
S4 is a general-purpose, distributed, scalable, fault-tolerant, pluggable platform that allows programmers to easily develop applications for processing continuous unbounded streams of data.
1. Apache S4

Apache Spark
Data analytics cluster computing framework originally developed in the AMPLab at UC Berkeley. Spark fits into the Hadoop open-source community, building on top of the Hadoop Distributed File System (HDFS). However, Spark provides an easier to use alternative to Hadoop MapReduce and offers performance up to 10 times faster than previous generation systems like Hadoop MapReduce for certain applications.
1. Apache Incubator Spark

Apache Spark Streaming
Framework for stream processing, part of Spark.
1. Apache Spark Streaming

Apache Storm
Storm is a complex event processor and distributed computation framework written predominantly in the Clojure programming language. It is a distributed real-time computation system for processing fast, large streams of data. Storm's architecture is based on the master-workers paradigm, so a Storm cluster mainly consists of a master and worker nodes, with coordination done by Zookeeper.
1. Storm Project
2. Storm-on-YARN

Apache Tez
Tez is a proposal to develop a generic application which can be used to process complex data-processing task DAGs and runs natively on Apache Hadoop YARN.
1. Apache Tez

Apache Twill
Twill is an abstraction over Apache Hadoop® YARN that reduces the complexity of developing distributed applications, allowing developers to focus more on their business logic. Twill uses a simple thread-based model that Java programmers will find familiar. YARN can be viewed as a compute fabric of a cluster, which means YARN applications like Twill will run on any Hadoop 2 cluster.
1. Apache Twill Incubator

Arvados
Spins a web of microservices around unsuspecting sysadmins.
1. Website

Blaze
Provides Python users high-level access to efficient computation on inconveniently large data.
1. Website

Cascalog
Data processing and querying library.
1. Cascalog

Cheetah
High Performance, Custom Data Warehouse on Top of MapReduce.
1. Paper

Concurrent Cascading
Application framework for Java developers to simply develop robust Data Analytics and Data Management applications on Apache Hadoop.
1. Cascading

Damballa Parkour
Library for developing MapReduce programs using the LISP-like language Clojure. Parkour aims to provide deep Clojure integration for Hadoop. Programs using Parkour are normal Clojure programs, using standard Clojure functions instead of new framework abstractions. Programs using Parkour are also full Hadoop programs, with complete access to absolutely everything possible in raw Java Hadoop MapReduce.
1. Parkour GitHub Project

Datasalt Pangool
A new MapReduce p...
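
The Apache MapReduce entry above describes MapReduce as a programming model for processing large data sets in parallel. As a rough illustration of that model only, the following single-process Python sketch counts words with explicit map, shuffle and reduce steps. It does not use Hadoop or any of the listed projects; the function names (mapper, shuffle, reducer, run_job) and the sample input are hypothetical and chosen just for this example.

```python
from collections import defaultdict
from itertools import chain


def mapper(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reducer(key, values):
    """Reduce step: combine all values of one key into a single result."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle and reduce in sequence (in one process, for illustration)."""
    mapped = chain.from_iterable(mapper(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reducer(key, values) for key, values in grouped.items())


# Hypothetical sample input, not taken from the dataset.
counts = run_job(["big data big clusters", "data flows"])
print(counts)
```

On a real cluster the map and reduce calls would presumably run on different nodes and the shuffle would move data over the network; here everything runs locally so that the three phases stay visible.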

bigdata.andreamostosi.name Whois

"domain_name_id": null, "domain_name": "ANDREAMOSTOSI.NAME", "registrar_id": null, "registrar": null, "registrant_id": null, "admin_id": null, "technical_id": null, "billing_id": null, "creation_date": null, "expiration_date": null, "updated_date": null, "name_server_ids": null, "name_servers": null, "status": "clientTransferProhibited https://icann.org/epp#clientTransferProhibited"