Working with and Visualizing Big Data Efficiently with Python for the DARPA XDATA Program
Abstract: Research performed under the XDATA program focused on computational techniques and software tools for analyzing large volumes of data, both semi-structured (e.g. tabular, relational, categorical, metadata) and unstructured (e.g. text, documents, message traffic). Several open source projects that have seen community and industry adoption grew out of this effort:
- Blaze: A collection of packages for describing, accessing, and manipulating disparate data sources and types.
- Numba: A just-in-time function compiler for Python, based on the LLVM compiler project, that allows researchers to run their Python code at near-native speeds on CPUs and GPUs.
- Dask: Parallelizes generic Python and extends NumPy, Pandas, and Scikit-learn with parallel variants.
- Bokeh: Creates interactive web applications from Python without requiring knowledge of JavaScript, CSS, or HTML.
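To make the Numba description concrete, the following is a minimal sketch and is not taken from the report. It assumes Numba's `njit` decorator is available; the function name `sum_of_squares` and the fallback decorator are hypothetical, added only so the example still runs where Numba is not installed.

```python
try:
    from numba import njit  # Numba's decorator for compiled ("nopython") functions
except ImportError:
    # Assumption for illustration: if Numba is missing, fall back to a
    # no-op decorator so the function runs as ordinary Python.
    def njit(func):
        return func


@njit
def sum_of_squares(n):
    # An explicit numeric loop: slow in the plain interpreter, but the
    # kind of code a just-in-time compiler can turn into machine code.
    total = 0.0
    for i in range(n):
        total += i * i
    return total


print(sum_of_squares(1000))
```

With Numba installed, the first call triggers compilation and later calls reuse the compiled version; without it, the function gives the same result at normal interpreter speed.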