I feel like I may be missing something, as the only details I can find are the ones in the "Description" section. If that's all there is at this stage, that's fine; just making sure I'm not missing a document or something.
As for the plan, it seems reasonable, though I have a concern with the term "not external facing". Please, please make sure that this service is developed under the assumption that it will be attacked with mild perseverance, even though it's not external-facing. Denial-of-service attacks aren't really a concern, but attacks that attempt to subvert control flow or access restrictions are likely to be encountered. This includes things like SQL injection, CLI injection, and API fuzzing. The attacker's end goal may include (but is not limited to) access to the collected records; tampering with the collected records; usurping any machines in the processing pipeline; or discrediting the work of XSEDE, its users, or the NSF.
We cannot assume that all users on all machines that can interact with this service are friendly. Even if there's an intermediate collector host with admin-only logins, the data flowing through it may contain user-supplied content (such as commands run, job names, etc.), all of which may contain hostile payloads.
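To make the concern concrete, here is a minimal sketch of what "treat user-supplied fields as data, never as code" could look like. Everything in it is an assumption on my part for illustration: the `job_records` table, its columns, and the `store_job_record` helper are hypothetical and are not taken from the project description. It uses only Python's standard `sqlite3` and `subprocess` modules.

```python
import sqlite3
import subprocess
import sys


def store_job_record(conn, user, job_name, command):
    """Insert one record using bound parameters, never string formatting.

    Hypothetical helper: the table and column names are illustrative only.
    """
    conn.execute(
        "INSERT INTO job_records (user, job_name, command) VALUES (?, ?, ?)",
        (user, job_name, command),
    )


# In-memory database standing in for whatever store the service ends up using.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job_records (user TEXT, job_name TEXT, command TEXT)")

# A job name chosen by a hostile user, shaped like an injection attempt.
hostile_name = "x'); DROP TABLE job_records; --"
store_job_record(conn, "alice", hostile_name, "sleep 1; rm -rf /")

# The payload is stored verbatim as text and the table is still intact.
rows = conn.execute("SELECT job_name FROM job_records").fetchall()

# Same idea for CLI injection: pass an argument list and do not use a shell,
# so metacharacters in the job name are never interpreted.
echoed = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", hostile_name],
    capture_output=True,
    text=True,
    check=True,
).stdout.strip()
```

The same pattern applies regardless of the database or language eventually chosen: bind values as parameters, and pass arguments as a list rather than building a shell string.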
Based on the description, I interpret this project as a collection of some level of detail of activity from all users of XSEDE resources for all time going forward. Please keep in mind that while this is, at some level, public information (such as in aggregate by award), the specific information collected has the potential to be (ab)used in surprising, misleading, and generally unpleasant ways. If feasible, collecting aggregates in lieu of specifics may be preferable. That is less trouble than collecting specifics and trying to protect them from unauthorized access.
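As a rough illustration of the aggregate-only approach, the sketch below reduces per-job records to per-award totals before anything is stored, so usernames and command lines never reach the central store. The record fields (`award`, `user`, `command`, `cpu_hours`), the sample values, and the `aggregate_by_award` function are all hypothetical; I don't know what the actual records will contain.

```python
from collections import defaultdict


def aggregate_by_award(records):
    """Collapse per-job records into per-award totals.

    Only the job count and CPU hours survive; user names and command
    lines are dropped at this step (field names are illustrative).
    """
    totals = defaultdict(lambda: {"jobs": 0, "cpu_hours": 0.0})
    for rec in records:
        entry = totals[rec["award"]]
        entry["jobs"] += 1
        entry["cpu_hours"] += rec["cpu_hours"]
    return dict(totals)


# Made-up sample records, purely for demonstration.
sample = [
    {"award": "AWD-1", "user": "alice", "command": "./sim --in a.dat", "cpu_hours": 2.5},
    {"award": "AWD-1", "user": "bob", "command": "./sim --in b.dat", "cpu_hours": 1.5},
    {"award": "AWD-2", "user": "carol", "command": "python fit.py", "cpu_hours": 4.0},
]

summary = aggregate_by_award(sample)
```

If the aggregation runs as close to the source as possible, the specifics never need to be transported or protected downstream.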