EP-1765: Big data or good data? Improving the quality of big data by open source clinical research protocols