And so it begins

Today I start my Major Project, a compulsory part of the BEng Software Engineering course at Aberystwyth University and our equivalent of the ‘dissertation’ you might write in another department.

Though the bulk of the work is in planning and implementing a substantial piece of technical work, there is also the small matter of the accompanying 20,000-word report, which dwarfs most other dissertations. As such, it’s a little frustrating that I can’t really refer to this piece as a dissertation, and have to give a paragraph-sized explanation to make explicit the amount of work a ‘major project’ actually involves. But that aside…

We’ve been advised to keep a diary of our progress throughout the project, which should come in handy when formulating the final report. Hence, here is my diary. I’m looking forward to finding the time to actually write blog posts on my WordPress site, which I’ve neglected in recent months.

Preamble finished, let’s get on to the main meat of the post…

Online Dispute Resolution for Maritime Collisions

My project, titled “Online Dispute Resolution for Maritime Collisions”, will automate the process of estimating the likelihood of success or defeat of a given maritime collision case. It will use machine learning to look at previous maritime collision cases (which will have been pre-fed into the system), compare these cases to the current case, and outline an approximate percentage chance of successful prosecution or defence. It will be an aid for lawyers, giving an overall idea of the complexity of the case without manual and laborious examination. It will also highlight similar cases, which the lawyers can use in forming their cases in court.

This is my understanding of the project as it currently stands, though I have not yet had the chance to meet with my supervisor. I hope to meet with him early this week to clarify my understanding of the project and the expected deliverables.

Outline Project Specification

By February 6th, I need to submit an Outline Project Specification detailing the proposed tasks and project deliverables I expect to complete as part of my project. I will be spending the next few days on this, and it will be somewhat shaped by the supervisor meeting(s) I hope to have this week. However, a preliminary overview:

Proposed Tasks

Get to grips with maritime law. This is supposedly quite a simple, concise law that should be translatable to code. I’ll need to spend time learning the ins and outs of the law so that I can build the software and validate that it is behaving correctly.
Gather historical maritime law cases, in a consistent format. This task may be sizeable if collections don’t already exist in the public domain, or quite simple if my supervisor is able to provide me with cases.
Process historical maritime law cases. I need cases so that they can be fed into the software and used in the machine learning process. This step may require natural language processing, scraping and parsing techniques, etc. I think it’s unlikely that details of the cases will be in a nice, consistent XML format!
Refine a processing algorithm – likely to be the most difficult part of the project. I need to write code that takes case details, compares them to historical cases, and accurately returns similar cases along with an indicator of the case’s likelihood of success in court.
Develop an interface. I expect the above stages to be low-level, passing command-line parameters and calling scripts directly, etc. As this should be online dispute resolution, this indicates the need for a web application, simple enough for untrained users to use, but with enough scope to input all of the intricacies of a legal case. Again, this might be quite sizeable – it may require user registration and authorisation, the ability to view historical legal case PDFs, advanced search capabilities, etc. This is where the scope will need to be carefully controlled, so that the project can stay on track.
Explore the ethical issues raised. My project will give an indication of the legal outcome of a case and highlight related cases that a lawyer could use in presenting their case. This may affect whether or not a lawyer agrees to take on a case in the first place. Do I want to develop something that could one day be a factor in turning away vulnerable people from the legal protection of their choice? Would it be possible for automated resolutions to replace a judge and jury? Could this be less biased than traditional law? It is possible that ethics may form a sizeable portion of my major project report. To consider the issues analytically, I will draw upon the ethical lessons learned in CS22310. This may mean that I protect my project with a prohibitive licence, drawing on information learned in CS38110. This is something that will need to be discussed with the customer early on in the process.
Explore the commercial viability of the project. This project looks exciting, and genuinely useful. If it presents a viable commercial opportunity, I’ll need to examine how this might be achieved, outline this in my final report, perhaps implement some of the functional requirements of such a commercial system, and also explore the legal and copyright issues that might be raised, given that I am doing work alongside a supervisor and as part of my degree at Aberystwyth University.
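
As a very rough sketch of what the processing step in the task list above might look like, here is a nearest-neighbour comparison in Python. Everything in it is an assumption made for illustration: the Case class, the idea that case details can be reduced to a list of numbers, the choice of cosine similarity, and the made-up example cases. None of this is a decided design, and the real algorithm may end up looking quite different.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Tuple


@dataclass
class Case:
    """Hypothetical record for a historical case (all fields are assumptions)."""
    name: str
    features: List[float]  # assumed numeric encoding of the case details
    claimant_won: bool


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two feature vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def assess(current: List[float], history: List[Case], k: int = 3) -> Tuple[List[Case], float]:
    """Return the k most similar historical cases and the share of them the claimant won."""
    if not history:
        raise ValueError("at least one historical case is required")
    ranked = sorted(
        history,
        key=lambda case: cosine_similarity(current, case.features),
        reverse=True,
    )
    nearest = ranked[:k]
    wins = sum(1 for case in nearest if case.claimant_won)
    return nearest, wins / len(nearest)


if __name__ == "__main__":
    # Invented cases purely to exercise the code; not real legal data.
    history = [
        Case("A", [1.0, 0.0, 0.0], True),
        Case("B", [0.9, 0.1, 0.0], True),
        Case("C", [0.0, 1.0, 0.0], False),
        Case("D", [0.0, 0.0, 1.0], False),
    ]
    similar, success_rate = assess([1.0, 0.0, 0.0], history, k=2)
    print([case.name for case in similar], success_rate)
```

The share of similar cases won is only a crude stand-in for the "percentage chance" described earlier. A proper machine learning model would need far more thought about how the features are chosen and how the result is validated.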

Project Deliverables

Choice of server. An online project needs a server. I’m going to need to investigate the possibilities – localhost, Amazon Web Services, DigitalOcean, Java EE, a standalone server, etc. – and write a report formalising my conclusions. This can be added to the appendices of my final report.
Choice of processing language for the algorithm. In writing the algorithm that performs the machine learning, I may choose to use a language well suited to data manipulation (such as Python or R), or if it becomes complex enough I may feel that Java or C++ might be more suitable. The algorithm implementation should be separate from the web application, and should be relatively easy to execute even from another language.
Choice of server-side language for the application. Note: this may be different to the choice of processing language for the algorithm. Likely candidates include PHP (most popular), Node.js (more consistency between back-end and the JavaScript front-end), etc. The three deliverables so far will each take the form of a short (2-3 page) report that will form part of my appendices.
Choice of build tool. Depends on my choice of algorithm/server-side language. I may use Ant or Maven if I end up using Java, or Grunt/Gulp if using JavaScript and Node.js.
Choice of version management system. I’m already strongly leaning towards Git, as this is what I use regularly and am familiar with, but it may be worth writing a report to justify my choice of Git over, say, SVN.
Problem domain specification. Having read and understood what there is to know about maritime law, using historical and current case details as a guide, and working closely with a contact in law, I’ll need to define a set of business rules to be translated into code.
Interface design. I hope to enlist the help of a designer to come up with an attractive website design that is in keeping with the style of the project. I’ll use this design as a guideline in developing the front-end interface.
Stories or features. I’m currently undecided as to the project process I’ll follow (e.g. Agile or Feature-Driven), but I need to outline some form of requirements. This may be a traditional requirements specification, but it is more likely to be a collection of features and/or stories which break the project up into manageable chunks. I hope to use something like Cucumber to enable BDD and make these features ‘executable’.
Test suites. I will develop in a test-driven way. A by-product of this methodology is that I will have a collection of test suites that serve as living documentation of the code.
Milestone 1: working algorithm. Though it may be tempting to jump ahead and start making a cool website, the website will be nothing without the underlying algorithm that can predict the outcome of cases. I aim to have such an algorithm working before I take any steps to deploy it to a web environment. At this stage, I expect to pass parameters via command line or file, run the algorithm, and see my results. It will take a while to get this far.
Milestone 2: working website. A basic website that has overcome the challenges of deployment. I should be able to automatically deploy the latest version of my codebase to the server of choice, and as a consumer I should be able to access the website, pass in details of my case and view the results, the algorithm having been triggered server-side.
Milestone 3: expanded website. Should look aesthetically pleasing, and may have additional functionality such as advanced search, etc.
Milestone 4: commercial implementation. If the project is deemed to be commercially viable, I’d like to implement some of the requirements of an e-commerce platform, namely user registration, authentication and authorisation, account management, and payment services.
JIRA tickets. I’ll use my own installation of JIRA for project management (managing deadlines, delegating tasks, etc). For code management, including bug fixes, refactoring, and so on, I intend to use GitHub Issues. Both of these systems will provide valuable project history that would be worthy of including in the appendices.
Final report. I hope to have the project completed in plenty of time so that I can concentrate on my final report, which will be complete with many of the above artefacts.
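
To make Milestone 1 above a little more concrete, here is a minimal sketch of how the algorithm might be run from the command line, assuming Python is the language chosen. The JSON case file, the --similar option and the function names are all hypothetical, and the prediction step itself is left as a placeholder.

```python
import argparse
import json


def build_parser() -> argparse.ArgumentParser:
    """Command-line interface for the (not yet written) prediction algorithm."""
    parser = argparse.ArgumentParser(
        description="Estimate the outcome of a maritime collision case (sketch)."
    )
    # Hypothetical input: a JSON file describing the current case.
    parser.add_argument("case_file", help="path to a JSON file of case details")
    # Hypothetical option: how many similar historical cases to report.
    parser.add_argument("--similar", type=int, default=3,
                        help="number of similar cases to return")
    return parser


def load_case(path: str) -> dict:
    """Read the case details from a JSON file."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def main(argv=None) -> int:
    args = build_parser().parse_args(argv)
    case = load_case(args.case_file)
    # Placeholder: the real comparison against historical cases would go here.
    print(f"Loaded {len(case)} case details; "
          f"would report the {args.similar} most similar cases.")
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
```

Saved as, say, predict.py (a made-up file name), this would be run as "python predict.py case.json --similar 5". Keeping the entry point this thin should make it easier for the web application to trigger the same code server-side later on.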