You guys do realize that none of these calculations are really an IT problem or something for a computer scientist, right? It's all just algebra and statistics.
In my opinion, introducing new topics to the discussion as new variables, when they need to be held constant for the controlled test conditions to have any validity, is a fallacious argument. It isn't really intended to address the comparison, but to propose a whole new problem.
While I agree that a significant number of factors could affect any given encounter between two players with these proposed builds, at some point you have to find a way to simplify the test or you'll never get it done. In my opinion, given the similarity of these builds, and assuming precisely equal player skill (more than that, we need to assume precisely the same responses), what we're left looking at is a very concentrated distribution of results, swing by swing.
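To make the "swing by swing" idea concrete, here is a rough sketch of what such a controlled test could look like. Everything in it is an assumption for illustration only: the hit chances, the damage ranges, the 20-swing fight length, and the names `simulate_fight` and `summarize` are all made up and are not the actual builds or game formulas. The only point is that once skill and responses are held constant, each build reduces to a distribution of per-swing results that you can compare with plain statistics.

```python
import random
import statistics


def simulate_fight(hit_chance, damage_low, damage_high, swings, rng):
    """Total damage over a fixed number of swings.

    Everything except the hit roll and the damage roll is held constant.
    """
    total = 0
    for _ in range(swings):
        if rng.random() < hit_chance:
            # Damage is assumed uniform over an integer range (hypothetical).
            total += rng.randint(damage_low, damage_high)
    return total


def summarize(build, trials=10_000, swings=20, seed=1):
    """Return (mean, standard deviation) of total damage across many fights."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    totals = [
        simulate_fight(
            build["hit_chance"],
            build["damage_low"],
            build["damage_high"],
            swings,
            rng,
        )
        for _ in range(trials)
    ]
    return statistics.mean(totals), statistics.stdev(totals)


# Hypothetical numbers, NOT the builds under discussion.
build_a = {"hit_chance": 0.80, "damage_low": 10, "damage_high": 20}
build_b = {"hit_chance": 0.75, "damage_low": 12, "damage_high": 21}

mean_a, sd_a = summarize(build_a)
mean_b, sd_b = summarize(build_b)

print(f"Build A: mean {mean_a:.1f}, sd {sd_a:.1f}")
print(f"Build B: mean {mean_b:.1f}, sd {sd_b:.1f}")
```

With these made-up numbers the algebra alone gives expected totals of 20 x 0.80 x 15 = 240 for build A and 20 x 0.75 x 16.5 = 247.5 for build B; the simulation just shows how tightly the results cluster around those values.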
Nax