Hal Hodson, technology reporter
Classifying different kinds of malware is notoriously hard, but crucial if computer defences are to keep up with the ever-evolving ecosystem of malicious programs. Treating computer viruses as a biological puzzle could help computer scientists get a better handle on the wide world of malware.
Ajit Narayanan and Yi Chen at the Auckland University of Technology, New Zealand, converted the signatures of 120 worms and viruses into an amino acid representation. The signatures are more usually presented in hexadecimal (a base-16 numbering system that uses the digits 0 to 9 as well as the letters a to f), but the amino acid "alphabet" is better suited to machine-learning techniques that can analyse a piece of code to figure out whether it matches a known malware signature.
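To make the conversion concrete, here is a minimal Python sketch of how a hexadecimal signature might be rewritten in an amino acid alphabet. The article does not say which letter the researchers assigned to which digit, so the mapping below, along with the names `HEX_TO_AMINO` and `hex_to_amino`, is purely an illustrative assumption.

```python
# Hypothetical mapping: the article does not specify which amino acid letter
# stands for which hexadecimal digit, so this assignment is illustrative only.
HEX_DIGITS = "0123456789abcdef"
AMINO_LETTERS = "ACDEFGHIKLMNPQRS"  # 16 of the 20 standard one-letter codes

HEX_TO_AMINO = dict(zip(HEX_DIGITS, AMINO_LETTERS))


def hex_to_amino(signature: str) -> str:
    """Translate a hexadecimal signature string into amino acid letters."""
    # Normalise case and drop spaces so "4D 5A" and "4d5a" are treated alike.
    cleaned = signature.lower().replace(" ", "")
    try:
        return "".join(HEX_TO_AMINO[ch] for ch in cleaned)
    except KeyError as exc:
        raise ValueError(f"not a hexadecimal digit: {exc.args[0]!r}") from None


if __name__ == "__main__":
    # A made-up signature fragment, not taken from any real malware database.
    print(hex_to_amino("4d5a 90"))
```

Any one-to-one assignment of digits to letters would serve equally well here; the point is only that the result is a string that tools built for protein sequences can read.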
Generally, malware experts identify and calculate the signatures of new malware, but it can be hard for them to keep up. While machine learning can help, it is limited because the hexadecimal signatures can be different lengths: Narayanan's team found that using machine learning to help classify the hexadecimal malware signatures resulted in accuracy no better than flipping a coin.
But some techniques used in bioinformatics for comparing amino acid sequences take differing lengths into account. After applying these to malware, Narayanan's average accuracy for classifying the signatures automatically using machine learning rose to 85 per cent.
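The article does not name the bioinformatics techniques the team used. As one well-known example of a method that copes with sequences of different lengths, the sketch below computes a Needleman-Wunsch global alignment score, in which gaps can be inserted into either sequence at a cost. The function name `alignment_score` and the scoring values (match, mismatch, gap) are assumptions chosen for illustration, not the researchers' settings.

```python
def alignment_score(a: str, b: str, match: int = 1,
                    mismatch: int = -1, gap: int = -2) -> int:
    """Needleman-Wunsch global alignment score for two sequences.

    The sequences may differ in length; the shorter one is effectively
    padded with gaps, each of which costs `gap`.
    """
    # prev[j] is the best score for aligning the part of `a` seen so far
    # with the first j letters of `b`. With nothing of `a` seen, that
    # means j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # first i letters of `a` against an empty `b`
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(
                prev[j - 1] + pair,  # align a[i-1] with b[j-1]
                prev[j] + gap,       # a[i-1] against a gap
                curr[j - 1] + gap,   # b[j-1] against a gap
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    # Toy sequences of different lengths, invented for this example.
    print(alignment_score("ACD", "AD"))
```

Scores like this, computed between pairs of sequences, give a length-independent measure of similarity that a classifier can work with, which is the general idea the paragraph above describes.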
Biology might help in other ways too. Narayanan notes that if further study shows malware evolution follows some of the same rules as amino acids and proteins, our knowledge of biological systems could be used to help fight it.