On today’s call we discussed our proposed refactoring of the Malware Instance data model for MAEC 5.0. Most of the refactoring involves grouping the various properties into buckets:
·Analysis metadata properties: tools that were used during the analysis of the Malware Instance, analysts, etc. We’re still working through the specifics of what we want to capture here and will bring it up in future discussions.
·Metadata: general metadata about the Malware Instance, such as names, labels, when it was first seen, etc.
·Behavioral Features: features associated with the semantics of the code that the malware executes. This is represented by the MAEC Behavior, Action, and Process Tree entities.
·Static Features: features associated with the binary that aren’t related to the semantics of its code, such as strings, packer information, etc.
·Capabilities: high-level abilities possessed by the Malware Instance, such as persistence, anti-vm, etc. We’ve developed a vocabulary for these.
·OS Features: operating-system-specific features used by the Malware Instance, such as named pipes on Windows. We’re still working through the specifics of this, but our thought was that we could have vocabularies for each OS class (Windows, Android, Linux, etc.).
·Structural Features: features associated with how the code is structured in the binary, such as import address table obfuscation. Most of these seem to deal with various types of obfuscation, so perhaps we should really capture this as such. As with Capabilities, we also have a vocabulary that captures these.
This is still a work in progress, and we’re interested in hearing any feedback you may have: do the high-level categories make sense? Are there other things we should be capturing (for each category or generally)?
For instance, I think we can probably capture any certificates used to sign the binary (e.g., using Authenticode) under the Static Features bucket.
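To make the proposed grouping easier to react to, here is a rough Python sketch of how the buckets might sit on a single Malware Instance object. It is only an illustration: the class, field names, and sample values (including the vocabulary-style strings) are all hypothetical and are not taken from the MAEC schema or its vocabularies.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MalwareInstance:
    """Hypothetical container with one field per proposed bucket."""
    # Tools, analysts, etc. used during analysis (details still undecided).
    analysis_metadata: Dict[str, List[str]] = field(default_factory=dict)
    # General metadata: names, labels, first-seen time, etc.
    metadata: Dict[str, List[str]] = field(default_factory=dict)
    # Behaviors, Actions, and Process Trees.
    behavioral_features: Dict[str, List[str]] = field(default_factory=dict)
    # Strings, packer information, possibly signing certificates.
    static_features: Dict[str, List[str]] = field(default_factory=dict)
    # High-level abilities, drawn from a vocabulary.
    capabilities: List[str] = field(default_factory=list)
    # Keyed by OS class, e.g. "windows", "android", "linux".
    os_features: Dict[str, List[str]] = field(default_factory=dict)
    # Mostly obfuscation-related entries, drawn from a vocabulary.
    structural_features: List[str] = field(default_factory=list)

    def populated_buckets(self) -> List[str]:
        """Return the names of buckets that hold at least one entry."""
        return [name for name, value in vars(self).items() if value]


# Illustrative values only; none of these are real vocabulary terms.
sample = MalwareInstance(
    metadata={"names": ["ExampleBot"], "labels": ["trojan"]},
    static_features={"strings": ["cmd.exe"], "certificates": ["<Authenticode signer>"]},
    capabilities=["persistence", "anti-vm"],
    os_features={"windows": ["named-pipe"]},
    structural_features=["import-address-table-obfuscation"],
)
print(sample.populated_buckets())
```

In this sketch the signing certificate sits under the static features bucket, matching the suggestion above. The behavioral and analysis metadata buckets are left empty to reflect that their contents are still under discussion.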