seeking a schema to disclose static analysis results

seeking a schema to disclose static analysis results

jose nazario
some years ago for work i wrote an automated static analyzer, which takes memory dumps and extracts the useful tidbits. it's a significant time saver at best, at worst a 2 minute detour.

the tool's output is flat text files right now, and i'd like to integrate it better into our infrastructure by using a structured schema. i'd also like to not lose any info, and i know that some of the schemas out there would drop useful data on the floor.

the info captured and reported includes:

- file hashes (md5, sha1, sha256, etc, plus fuzzy hashes)
- AV detection
- file type
- file name
- file size
- packer(s) seen
- RAR contents, CAB contents, ZIP contents (as appropriate)
- sigcheck output
- environment/vmm/sandbox detection indicators
- embedded urls
- embedded url elements
- embedded filenames
- embedded registry keys
- base64 decode strings
- embedded IPs
- embedded hostnames
- embedded email addresses
- candidate behaviors (e.g. HTTP client, keystroke logger, etc)
- PE info
- imports and APIs called

some of these appear to be covered by things like CybOX, IEEE malware metadata, MITRE MAEC, etc, but not all. like i said, i'm not willing to drop any of the above on the floor.

any suggestions welcome, otherwise i'll get going on my own schema.
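
A minimal sketch of what a structured record for the fields listed above might look like, assuming JSON as the carrier. The class name `StaticReport`, the field names, and the sample values are all hypothetical and are not taken from CybOX, MAEC, or any other existing schema. The `extra` dict is the point of the sketch: anything without a dedicated field is kept there rather than dropped.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class StaticReport:
    """Hypothetical container for one sample's static analysis results."""
    file_name: str
    file_size: int
    file_type: str
    hashes: dict = field(default_factory=dict)         # hash type -> value
    av_detections: dict = field(default_factory=dict)  # engine -> label
    packers: list = field(default_factory=list)
    embedded_urls: list = field(default_factory=list)
    embedded_ips: list = field(default_factory=list)
    candidate_behaviors: list = field(default_factory=list)
    extra: dict = field(default_factory=dict)          # no dedicated field yet

    def to_json(self) -> str:
        # asdict() recurses into the nested dicts and lists
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Made-up values, for illustration only
report = StaticReport(
    file_name="sample.exe",
    file_size=73728,
    file_type="PE32 executable",
    hashes={"md5": "0" * 32},          # placeholder, not a real digest
    packers=["UPX"],
    extra={"sigcheck": "unsigned"},    # kept even though no field exists for it
)
round_tripped = json.loads(report.to_json())
```

Serializing and parsing the record again returns every field, including the ones parked in `extra`, which is the "drop nothing" property asked for above.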

_____________________________
jose nazario, ph.d. [hidden email]
sr. manager of security research, arbor networks
blog:    http://asert.arbor.net/
twitter: @arbornetworks

RE: seeking a schema to disclose static analysis results

Darien Kindlund
Hi Jose,

My understanding is that the MAEC schema is designed to capture output from dynamic analysis environments -- not from static analysis, as a separate schema exists to capture that information.  Ivan, can you let Jose know which schema would be appropriate for capturing his static analysis information?  I believe improvements to the static analysis schema are still being accepted, so your elements could be added if they don't already exist.

Regards,
-- Darien

--
Darien Kindlund
Sr. Staff Scientist
Direct: +1 (703) 608-8749 | Fax: +1 (408) 321-9818
Email: [hidden email]

FireEye, Inc.
Next Generation Threat Protection
1420 Beverly Rd. #150, McLean, VA 22101
http://www.FireEye.com



______________________________________________________________________
This email and any attachments thereto may contain private, confidential, and/or privileged material for the sole use of the intended recipient.  Any review, copying, or distribution of this email (or any attachments thereto) by others is strictly prohibited.  If you are not the intended recipient, please contact the sender immediately and permanently delete the original and any copies of this email and any attachments thereto.
______________________________________________________________________

Re: seeking a schema to disclose static analysis results

David Kovar
Greetings,

I am working on a similar effort and am stuffing the results into a
database. I'd love to find a good interchange format and was looking
at OpenIOC but am not wedded to it yet.

-David



RE: seeking a schema to disclose static analysis results

Kirillov, Ivan A.
In reply to this post by Darien Kindlund
Darien,

That's partially correct - the current & previous versions of MAEC have been targeted primarily towards dynamic analysis, though MAEC is intended to be an encompassing format for all analysis types, and we're hoping to expand our coverage of static analysis in the near future. At the moment, most of our static analysis coverage comes through the CybOX objects, particularly the Windows Executable File Object, which can encode information about PE file structure and headers.

Jose - I believe the MAEC Bundle, which is intended to encompass the analysis results for a malware instance, would likely be suitable as a schema to capture the results of your tool. I've tried to identify in your list which schemas/objects can be used to characterize each particular entry; if you could send me a sample txt output file, I can generate a sample MAEC Bundle representation for you.
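
As a rough illustration, the annotations in the quoted list below could be held in a lookup table so that a converter can tell which fields still lack a home. This is only a sketch: the object names are copied as written in this thread rather than checked against the schema files, and `CYBOX_MAPPING` and `unmapped_fields` are hypothetical names.

```python
# Field -> object, as annotated in the quoted list. None means the entry
# carried no annotation. Object names are as written in this thread.
CYBOX_MAPPING = {
    "file hashes": "File Object",
    "AV detection": "File Object",
    "file type": "File Object",
    "file name": "File Object",
    "file size": "File Object",
    "packer(s) seen": "File Object",
    "RAR/CAB/ZIP contents": None,
    "sigcheck output": None,
    "environment/vmm/sandbox detection indicators": None,
    "embedded urls": "URL Object",
    "embedded url elements": "URL Object",
    "embedded filenames": "File Object",
    "embedded registry keys": "Registry Object",
    "base64 decode strings": None,
    "embedded IPs": "Address Object",
    "embedded hostnames": "System Object",
    "embedded email addresses": "Email Object",
    "candidate behaviors": "MAEC",
    "PE info": "Win Executable File Object",
    "imports and APIs called": "Win Executable File Object",
}


def unmapped_fields(mapping):
    """Return the field names that have no target object, in list order."""
    return [name for name, target in mapping.items() if target is None]


gaps = unmapped_fields(CYBOX_MAPPING)
```

Here `gaps` holds the four entries without an annotation: archive contents, sigcheck output, sandbox detection indicators, and base64 strings.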

Regards,
Ivan

Ivan Kirillov
MAEC Project
The MITRE Corporation

> -----Original Message-----
> From: [hidden email] [mailto:owner-maec-
> [hidden email]] On Behalf Of Jose Nazario
> Sent: Thursday, August 23, 2012 10:54 AM
> To: maec-discussion-list Malware Attribute Enumeration Discussion
> Subject: seeking a schema to disclose static analysis results
>
> the info captured and reported includes:
>
> - file hashes (md5, sha1, sha256, etc, plus fuzzy hashes) [MAEC/CybOX File Object]
> - AV detection [MAEC/CybOX File Object]
> - file type [MAEC/CybOX File Object]
> - file name [MAEC/CybOX File Object]
> - file size [MAEC/CybOX File Object]
> - packer(s) seen [MAEC/CybOX File Object]
> - RAR contents, CAB contents, ZIP contents (as appropriate)
> - sigcheck output
> - environment/vmm/sandbox detection indicators
> - embedded urls [MAEC/CybOX URL Object]
> - embedded url elements [MAEC/CybOX URL Object]
> - embedded filenames [MAEC/CybOX File Object]
> - embedded registry keys [MAEC/CybOX Registry Object]
> - base64 decode strings
> - embedded IPs [MAEC/CybOX Address Object]
> - embedded hostnames [MAEC/CybOX System Object]
> - embedded email addresses [MAEC/CybOX Email Object]
> - candidate behaviors (e.g. HTTP client, keystroke logger, etc) [MAEC]
> - PE info [MAEC/CybOX Win Executable File Object]
> - imports and APIs called [MAEC/CybOX Win Executable File Object]

RE: seeking a schema to disclose static analysis results

Kirillov, Ivan A.
In reply to this post by David Kovar
David,

As I said in response to Jose, if you can send me a listing or example of the data elements that you capture, I can provide you with a mapping or example of their representation in MAEC/CybOX.

MAEC & CybOX are more expressive than OpenIOC and cover a superset of it, so they should be able to capture a much wider variety of information.

Also, as Darien said, we're definitely looking for input on static-analysis-specific information (e.g. entropy, function hashing, etc.) that we should capture in MAEC.
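
For readers unfamiliar with the entropy measure mentioned above, here is a minimal sketch of byte-level Shannon entropy, which is one common way to compute it. The function name `shannon_entropy` is hypothetical, and nothing here is taken from MAEC or CybOX.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # byte value -> number of occurrences
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


low = shannon_entropy(b"\x00" * 64)       # a single repeated byte
high = shannon_entropy(bytes(range(256)))  # every byte value exactly once
```

Values near 8 bits per byte are typical of compressed or encrypted data, which is why the figure is often used as a hint that a sample is packed.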

Regards,
Ivan

Ivan Kirillov
MAEC Project
The MITRE Corporation


Re: seeking a schema to disclose static analysis results

David Kovar
Greetings,

I have multiple tables in the database but here are the two that are
most relevant.

A table entry describing a single malware sample:

        # Create table describing individual malware sample
        malware_table_sql = """CREATE TABLE malware
        (
        `md5` char(32) NOT NULL,     # md5 hash - primary key
        `tag` varchar(64),           # user defined tags
        `type` varchar(32),          # malware type - exe, dll, js, etc
        `platform` varchar(24),      # Platform targeted
        `size` int(8),               # file size
        `sha1` char(40),             # sha1 hash (40 hex characters)
        `sha256` char(64),           # sha256 hash
        `fuzzy_hash` char(96),       # fuzzy hash
        `fuzzy_match` char(96),      # match in fuzzy hash database
        `entropy` float,             # entropy
        `zoo_date` datetime,         # date entered in malware zoo
        `zoo_name` varchar(255),     # zoo file name of sample (most often hash)
        `comment` varchar(255),      # user comments on sample
        `do_not_dup` int,            # do not duplicate beyond this zoo
        UNIQUE (`sha1`),
        PRIMARY KEY (`md5`)
        )"""


A table entry describing malware metadata. This could be part of the
malware sample entry itself.

        sources_table_sql = """CREATE TABLE sources
        (
        `md5` char(32) NOT NULL,
        `original_name` varchar(255), # original name of the file
        `tag` varchar(64),          # Free form, user tag
        `source` varchar(64),       # Source of sample - email, web site, ....
        `location` varchar(256),    # Full path, IP address, system, ....
        `collection_date` datetime, # Date collected
        `case_str` varchar(24),     # Engagement/case id
        `case_id` int(8),           # Pointer to case record (future use)
        `comment` varchar(256),
        PRIMARY KEY (`md5`),
        FOREIGN KEY (`md5`) REFERENCES malware(`md5`)
        )"""
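
A small sketch of how the hash and size columns of the malware table could be filled from a sample's bytes, using Python's standard `hashlib`. The helper name `sample_row` is hypothetical, and the remaining columns (tags, fuzzy hash, entropy, zoo fields) are left out.

```python
import hashlib


def sample_row(data: bytes) -> dict:
    """Compute the hash and size columns for one sample."""
    return {
        "md5": hashlib.md5(data).hexdigest(),        # 32 hex characters
        "sha1": hashlib.sha1(data).hexdigest(),      # 40 hex characters
        "sha256": hashlib.sha256(data).hexdigest(),  # 64 hex characters
        "size": len(data),
    }


# Empty input used only to keep the example self-contained
row = sample_row(b"")
```

The hex digest lengths are fixed by the algorithms, so they are a useful check on the declared column widths.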

-David


Re: seeking a schema to disclose static analysis results

jose nazario
david

a suggestion: move the hashes (crypto hashes and fuzzy hashes) to their own table holding arbitrary hash values, hash types (md5, sha1, sha256, ssdeep, etc), and a foreign key to the sample. we do this and have found it very flexible over many years.
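
A minimal sketch of that layout, using SQLite through Python's `sqlite3` module instead of the MySQL-style DDL earlier in the thread so that it runs standalone. The table and column names (`samples`, `hashes`, `hash_type`, `hash_value`) and the inserted values are hypothetical placeholders, not the schema actually in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
CREATE TABLE samples (
    id INTEGER PRIMARY KEY,
    zoo_name TEXT
);
CREATE TABLE hashes (
    sample_id  INTEGER NOT NULL REFERENCES samples(id),
    hash_type  TEXT NOT NULL,   -- md5, sha1, sha256, ssdeep, ...
    hash_value TEXT NOT NULL,
    PRIMARY KEY (sample_id, hash_type)
);
CREATE INDEX idx_hashes_lookup ON hashes (hash_type, hash_value);
""")

cur = conn.execute("INSERT INTO samples (zoo_name) VALUES (?)", ("sample-0001",))
sample_id = cur.lastrowid

# Placeholder values only; a new hash type needs a new row, not a new column
conn.executemany(
    "INSERT INTO hashes (sample_id, hash_type, hash_value) VALUES (?, ?, ?)",
    [
        (sample_id, "md5", "0" * 32),
        (sample_id, "sha256", "0" * 64),
        (sample_id, "ssdeep", "3:placeholder:placeholder"),
    ],
)

# Look a sample up by any hash type
found = conn.execute(
    "SELECT s.zoo_name FROM samples s "
    "JOIN hashes h ON h.sample_id = s.id "
    "WHERE h.hash_type = ? AND h.hash_value = ?",
    ("md5", "0" * 32),
).fetchone()
hash_count = conn.execute("SELECT COUNT(*) FROM hashes").fetchone()[0]
```

With this shape the sample table no longer needs a column per algorithm, and the sample's primary key is independent of any one hash.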

_____________________________
jose nazario, ph.d. [hidden email]
sr. manager of security research, arbor networks
blog:    http://asert.arbor.net/
twitter: @arbornetworks