Lawrence Berkeley National Laboratory
Recent Work

Title
SpaRC: scalable sequence clustering using Apache Spark