The status quo in source code analysis
While the need for cybersecurity is now well understood in the face of cybercrime (the recent news offers a steady supply of cyber-attacks), a major trend now emphasizes the need not only to build secure applications from the very beginning (Security by Design) but also to ensure that all sensitive data processed by the software is handled properly and that its security properties (integrity, confidentiality, etc.) are preserved throughout every processing phase (Privacy by Design, where personal data is concerned).
These security and privacy by design principles can be applied to software applications from the very first lines of code by using scanners based on static analysis. Such scanners are very good at detecting potential flaws, but the current state of the art shows the limits of static analysis: the warnings it raises include numerous false positives (false positive rates are often higher than 70%). Identifying the real flaws requires every warning to be investigated manually, which is time-consuming and often impossible in a real development process, given the time and budget constraints of project teams.
Another limitation of the current state of the art is that static analysis works very close to the programming language, which makes it difficult to take the user’s context, or the application’s, into account when analyzing vulnerabilities.
YAGAAN software: addressing current challenges in source code analysis while bringing in new capabilities
We created YAGAAN to overcome those difficulties. The YAG-Suite is our innovative application security solution, which brings new capabilities to source code analysis. It takes a decision-support approach: its aim is to help developers and reviewers efficiently target the source code vulnerabilities that are most relevant to fix. The YAG-Suite combines artificial intelligence with conventional static analysis techniques. The first benefit of combining these technologies is that it brings machine learning capabilities to existing scanners, including open source ones.
To address the false positive issue, the YAG-Suite is used alongside an existing scanner. Let’s take PHP_CodeSniffer (PHPCS) as an example: PHPCS is run on the application under review, and the YAG-Suite processes the PHPCS warnings to automate the false positive reduction that was previously done by hand.
As the YAG-Suite is based on a supervised approach, it first needs a training step. The user reviews a limited subset of warnings, as usual, but for each of them tells the YAG-Suite whether the warning is a real vulnerability or a false positive. The YAG-Suite learns from these answers and can then automate the analysis of all the remaining warnings. It assigns every warning a relevance score between 0% and 100%. The higher the score, the more relevant the warning: 0% indicates that the related PHPCS warning is a false positive, while 100% indicates a true positive. As a result, all PHPCS warnings are sorted into a prioritized list, from the highest relevance (likely true positives) to the lowest (likely false positives). This helps the reviewer focus on the most critical issues much faster. Of course, throughout the review, the user can keep feeding the machine learning with new cases to make it more accurate when needed.
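The train-then-rank workflow described above can be sketched in a few lines of Python. This is only an illustration under strong assumptions, not the YAG-Suite's actual model: it learns a single smoothed true-positive rate per scanner rule, and the `ScanWarning` type, the function names and the rule identifiers are all hypothetical. Because of the smoothing, its scores never reach exactly 0% or 100%.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanWarning:
    rule: str   # scanner rule identifier (hypothetical names in the usage below)
    file: str
    line: int


def train(labelled):
    """Count, per rule, how many reviewed warnings were real vulnerabilities."""
    counts = defaultdict(lambda: [0, 0])  # rule -> [true positives, reviewed]
    for warning, is_real in labelled:
        counts[warning.rule][0] += int(is_real)
        counts[warning.rule][1] += 1
    return dict(counts)


def relevance(model, warning):
    """Laplace-smoothed share of true positives for the rule, in percent."""
    true_positives, reviewed = model.get(warning.rule, (0, 0))
    return 100.0 * (true_positives + 1) / (reviewed + 2)


def prioritize(model, warnings):
    """Sort warnings from most to least relevant; no warning is dropped."""
    return sorted(warnings, key=lambda w: relevance(model, w), reverse=True)


# Usage: the reviewer labels a small subset, the rest is ranked automatically.
reviewed = [
    (ScanWarning("sql-injection", "login.php", 12), True),
    (ScanWarning("sql-injection", "search.php", 48), True),
    (ScanWarning("line-length", "login.php", 90), False),
    (ScanWarning("line-length", "utils.php", 15), False),
]
model = train(reviewed)

remaining = [
    ScanWarning("line-length", "a.php", 3),
    ScanWarning("sql-injection", "b.php", 40),
    ScanWarning("xss", "c.php", 7),  # rule never reviewed: neutral 50% score
]
ranked = prioritize(model, remaining)
for w in ranked:
    print(f"{relevance(model, w):5.1f}%  {w.rule}  {w.file}:{w.line}")
```

Calling `train` again with additional labelled warnings corresponds to the continued feedback mentioned above: the per-rule rates shift as the reviewer confirms or rejects more cases.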
Another benefit of using artificial intelligence is that the machine learning can be extended to integrate the user’s feedback on the impact of a potential flaw in terms of confidentiality, integrity and availability. The analysis can thus be fine-tuned to the context of the application and to the user’s own evaluation of the flaw. As a result, the YAG-Suite includes an impact analysis of each warning against those criteria (confidentiality, integrity and availability). This gives the solution a unique ability to calculate the CVSS score of each warning from its own C/I/A criticality assessment rather than from a generic score for the type of vulnerability.
Last but not least, automated learning from user feedback makes it possible to capture and reproduce, within the analysis, the user’s non-quantifiable expertise and risk perception. In this way, the YAG-Suite adapts itself to the user’s specific business context and becomes more specialized over time.
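To illustrate how a per-warning C/I/A assessment changes the resulting score, here is a minimal sketch based on the CVSS v3.1 base score equations. The text does not say which CVSS version or which exploitability values the YAG-Suite uses, so two assumptions are made here: scope is unchanged, and the exploitability metrics are fixed to their most exposed values (AV:N/AC:L/PR:N/UI:N). The function names are illustrative.

```python
import math

# CVSS v3.1 numeric weights for the C/I/A impact metrics (None / Low / High).
IMPACT_WEIGHT = {"N": 0.0, "L": 0.22, "H": 0.56}

# Assumption: exploitability fixed to AV:N / AC:L / PR:N / UI:N, scope unchanged.
EXPLOITABILITY = 8.22 * 0.85 * 0.77 * 0.85 * 0.85


def roundup(value):
    """Round up to one decimal, following the CVSS v3.1 Roundup definition."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(c, i, a):
    """CVSS v3.1 base score for the given C/I/A levels ('N', 'L' or 'H')."""
    iss = 1 - (1 - IMPACT_WEIGHT[c]) * (1 - IMPACT_WEIGHT[i]) * (1 - IMPACT_WEIGHT[a])
    impact = 6.42 * iss
    if impact <= 0:
        return 0.0
    return roundup(min(impact + EXPLOITABILITY, 10))


# Same kind of flaw, different context-specific C/I/A assessments.
print(base_score("H", "H", "H"))  # flaw touching sensitive data end to end
print(base_score("H", "N", "N"))  # confidentiality impact only
print(base_score("N", "N", "N"))  # no impact in this application's context
```

Two warnings of the same vulnerability type can therefore end up with quite different scores once the reviewer's C/I/A judgment is taken into account.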
The final output of the YAG-Suite is the entire list of PHPCS warnings (the YAG-Suite does not delete any warning), where each warning carries a contextualized relevance attribute, an individual CVSS score, and a weighted CVSS score, which weights the CVSS score by the warning’s relevance. For instance, if two warnings have similar CVSS scores and one of them is a false positive, its weighted CVSS score will be much lower than that of the true positive.
Finally, the YAG-Suite shows the user a fully prioritized list of the warnings raised by PHPCS and helps them focus the fixing effort on the most critical flaws.
The YAG-Suite currently supports the Java and PHP languages.
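As a rough illustration of the weighting, the sketch below assumes the simplest possible rule: the weighted score is the CVSS score multiplied by the relevance expressed as a fraction. The exact formula used by the YAG-Suite is not given here, so this product, like the function name, is an assumption.

```python
def weighted_cvss(cvss, relevance_pct):
    """Scale a CVSS score (0-10) by a relevance percentage (0-100).

    Assumption: a plain product; the real weighting may differ.
    """
    if not 0.0 <= cvss <= 10.0:
        raise ValueError("CVSS score must be between 0 and 10")
    if not 0.0 <= relevance_pct <= 100.0:
        raise ValueError("relevance must be between 0 and 100")
    return cvss * relevance_pct / 100.0


# Two warnings with the same CVSS score but very different relevance.
likely_true_positive = weighted_cvss(7.5, 95)
likely_false_positive = weighted_cvss(7.5, 5)
print(likely_true_positive, likely_false_positive)
```

With this rule the likely false positive drops to the bottom of the list without being removed from it, which matches the behavior described above.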
YAGAAN’s founding team is composed of Hervé Le Goff, CEO, and Antoine Floch. Hervé worked for 25 years at the “Direction générale de l’armement” (DGA, French Ministry of the Armed Forces) in technical expertise (including information security), people and project management, and international cooperation.
Antoine, co-founder of YAGAAN, holds a PhD in compilation and operations research. He has worked for 10 years on various static source code analysis methods and tools.