PE Static Analysis Attributes?

PE Static Analysis Attributes?

Kirillov, Ivan A.
Hello Everyone,

One of the things we're planning on adding to the next MAEC schema release is the capability of characterizing specific Windows PE binary attributes obtained through static analysis.

We've taken a stab at compiling an initial list of these, but we'd appreciate your input on anything else that you think needs to be added (or removed!). Here's the list:

-Strings
-Linker/Version
-Size of Code
-Size of Initialized Data
-Size of Uninitialized Data
-Imports
-Exports
-Entry Point Address
-Base Code Address
-Base Data Address
-Section Info
        *Name
        *Virtual Size
        *Virtual Address
-Resources
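
For illustration, most of the header-level items in this list can be read directly from the PE optional header. The following is a minimal Python sketch and not part of MAEC: the function name read_optional_header and the dictionary keys are hypothetical, only the standard library is used, and a well-formed file is assumed. Strings, Imports, Exports, Section Info and Resources need more parsing than is shown here.

```python
import struct

def read_optional_header(data: bytes) -> dict:
    """Return a few optional-header fields from the raw bytes of a PE file."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew at offset 0x3C holds the file offset of the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The optional header follows the 4-byte signature and 20-byte file header.
    opt = pe_offset + 4 + 20
    (magic, linker_major, linker_minor, size_of_code, size_of_init,
     size_of_uninit, entry_point, base_of_code) = struct.unpack_from(
        "<HBBIIIII", data, opt)
    fields = {
        "linker_version": f"{linker_major}.{linker_minor}",
        "size_of_code": size_of_code,
        "size_of_initialized_data": size_of_init,
        "size_of_uninitialized_data": size_of_uninit,
        "entry_point_address": entry_point,
        "base_code_address": base_of_code,
        "base_data_address": None,
    }
    # BaseOfData exists only in PE32 (magic 0x10B); PE32+ (0x20B) drops it.
    if magic == 0x10B:
        (fields["base_data_address"],) = struct.unpack_from("<I", data, opt + 24)
    return fields
```

Typical use would be read_optional_header(open(path, "rb").read()), where path is whatever sample is being characterized.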

Also, in the next few weeks we hope to post a tracker for MAEC schema-related development issues on the MAEC website, which should give you more insight into planned improvements and revisions. I'll keep you posted.
       
Regards,
Ivan

Ivan Kirillov
MAEC Working Group
The MITRE Corporation

RE: PE Static Analysis Attributes?

Mayank.2.Bhatnagar

Hi,

We could also add the following attributes, which are useful when analysing Windows PE binaries:

1. DLL characteristics/DLL Count
2. Size of Image
3. Number of Sections

Thanks & Regards,
Mayank

-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Kirillov, Ivan A.
Sent: Wednesday, November 10, 2010 1:45 AM
To: maec-discussion-list Malware Attribute Enumeration Discussion
Subject: PE Static Analysis Attributes?

 



Re: PE Static Analysis Attributes?

Lippenholz, Scot [USA]
In reply to this post by Kirillov, Ivan A.
One more: the area of persistence. How does it start up?

Thanks,
Scot

Poorly spelled from my mobile device.

----- Reply message -----
From: "Mayank.2.Bhatnagar" <[hidden email]>
Date: Wed, Nov 10, 2010 7:14 am
Subject: PE Static Analysis Attributes?
To: "Kirillov, Ivan A." <[hidden email]>, "maec-discussion-list Malware Attribute Enumeration Discussion" <[hidden email]>



Re: PE Static Analysis Attributes?

Rajesh Nikam
Hi,

Could we also add

 - Section Entropy
 - Packer ID - UPX, ASPack, VB, Delphi etc
 - Installer ID - NSIS, Wise, WinRar SFX etc
 - File Size
 - Appended Data size
 - Version Information - File/Product Version, Company Name
 - Digital Certificate Info - if present, valid, issuer
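
Of these, section entropy is the easiest to pin down precisely. A minimal Python sketch, assuming the usual Shannon entropy over byte frequencies (0 to 8 bits per byte); the function name is hypothetical, and the thread does not prescribe a formula or a threshold:

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # occurrences of each byte value
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

Values close to 8 are typical of compressed or encrypted data, which is why high section entropy is often read as a hint of packing.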

Adding strings needs to be done with care: how are the strings extracted (using the strings.exe tool from Sysinternals, or something else)? In some cases there are lots of strings in a file.


Thanks
Rajesh

On Wed, Nov 10, 2010 at 6:51 PM, Lippenholz, Scot [USA] <[hidden email]> wrote:



RE: PE Static Analysis Attributes?

Chase, Melissa P.
In reply to this post by Mayank.2.Bhatnagar

Hi Mayank,

Could you elaborate on #1? What DLL characteristics would you want to see included?

Thanks,
Penny



Re: PE Static Analysis Attributes?

Meyers, Adam
I would think exports would be a good place to start.

Adam
Adam Meyers
Director, Cyber Security Intelligence
Cyber Security Division
SRA International

cell: 903.231.3371
secure (ovv): 5712967631
http://www.sra.com

PGP Fingerprint: 6476 C089 9EB6 C076 ADCF 1102 5097 97C9 EE21 49E5

 
From: Chase, Melissa P. [mailto:[hidden email]]
Sent: Wednesday, November 10, 2010 09:49 AM
To: Mayank.2.Bhatnagar <[hidden email]>; Kirillov, Ivan A. <[hidden email]>; maec-discussion-list Malware Attribute Enumeration Discussion <[hidden email]>
Subject: RE: PE Static Analysis Attributes?
 



RE: PE Static Analysis Attributes?

McCarl, Michael
In reply to this post by Kirillov, Ivan A.
because the pe structure is well documented, i think it's safe to
include all the values from the dos header, file header, optional header
and section headers.  additionally, hash values of each of these headers
would be necessary. (btw, this would include dll characteristics as
Mayank suggests)
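
As a rough illustration of per-header hash values, the sketch below slices the four header regions out of a PE file and hashes each one. It is a Python sketch under stated assumptions: SHA-256 is an arbitrary choice (the thread does not name a hash algorithm), the function and key names are hypothetical, and the file is assumed to be well formed.

```python
import hashlib
import struct

def header_hashes(data: bytes) -> dict:
    """Hash the DOS, file, optional and section headers of a PE file."""
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)  # e_lfanew
    coff = pe_offset + 4                                 # skip "PE\0\0"
    (num_sections,) = struct.unpack_from("<H", data, coff + 2)
    (opt_size,) = struct.unpack_from("<H", data, coff + 16)
    opt = coff + 20                                      # file header is 20 bytes
    sect = opt + opt_size                                # section table follows
    regions = {
        "dos_header": data[:64],
        "file_header": data[coff:coff + 20],
        "optional_header": data[opt:sect],
        "section_headers": data[sect:sect + 40 * num_sections],
    }
    return {name: hashlib.sha256(chunk).hexdigest()
            for name, chunk in regions.items()}
```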

additionally, hash values for strings and resources would be useful. (we
might want to specify that a "string" is of a specific minimum length to
eliminate strings of 1 or 2 characters).  also, we would need to know if
the string is a unicode or ansi string.
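
A minimal Python sketch of such an extraction rule, with the minimum length as a parameter and the encoding recorded per string. The assumptions here are mine: "ansi" is treated as runs of printable ASCII bytes, "unicode" as the same characters in UTF-16LE, and the function name and tuple layout are hypothetical.

```python
import re

def extract_strings(data: bytes, min_len: int = 3):
    """Yield (offset, encoding, text) for printable runs of at least min_len characters."""
    # Printable ASCII run, e.g. b"hello".
    ansi = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    # Same characters stored as UTF-16LE, e.g. b"h\x00e\x00l\x00l\x00o\x00".
    wide = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    for match in ansi.finditer(data):
        yield match.start(), "ansi", match.group().decode("ascii")
    for match in wide.finditer(data):
        yield match.start(), "utf-16le", match.group().decode("utf-16-le")
```

Each yielded text could then be hashed, so that a large string table is stored as hash values rather than raw text.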

i think that information about the packer (if packed) is ok, but i think
it's not as useful as it once was (mainly because of the number of
packers).

one item that isn't just a pe file attribute is the various names
(aliases) the file has been known to use.  

with respect to hash values, i'd like to comment that while there are
lots of hash values we could use, most aren't very useful unless you're
searching for other files that have the exact same hash value.  if
"context triggered piecewise hashing" (jesse kornblum,
http://www.dfrws.org/2006/proceedings/12-Kornblum.pdf) were used, files
(or portions thereof) could be compared for similarity instead of just
simple exact matches.
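
To make the idea concrete, here is a toy Python sketch of piecewise hashing. It is a deliberate simplification and not Kornblum's algorithm or the ssdeep tool: the window size, the modulus, the use of a byte-sum rolling value, SHA-256 for the pieces, and difflib for scoring are all assumptions chosen for brevity. It shows only the principle that piece boundaries are chosen by the content itself, so a local change disturbs only the nearby part of the signature.

```python
import hashlib
from difflib import SequenceMatcher

B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def piecewise_signature(data: bytes, window: int = 7, modulus: int = 64) -> str:
    """Return one character per content-defined piece of the input."""
    signature = []
    piece_start = 0
    rolling = 0  # sum of the last `window` bytes
    for i, byte in enumerate(data):
        rolling += byte
        if i >= window:
            rolling -= data[i - window]
        # Trigger point: the boundary depends only on the last few bytes.
        if rolling % modulus == modulus - 1:
            piece = data[piece_start:i + 1]
            signature.append(B64[hashlib.sha256(piece).digest()[0] % 64])
            piece_start = i + 1
    if piece_start < len(data):  # trailing piece without a trigger
        tail = data[piece_start:]
        signature.append(B64[hashlib.sha256(tail).digest()[0] % 64])
    return "".join(signature)

def similarity(a: bytes, b: bytes) -> float:
    """Score two inputs between 0.0 and 1.0 by comparing their signatures."""
    return SequenceMatcher(None, piecewise_signature(a),
                           piecewise_signature(b)).ratio()
```

Two files that differ in a single byte have completely different SHA-256 values, but their piecewise signatures differ in only a character or two, so similarity() stays close to 1.0.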

regarding imports, would this be the "initially visible" imports?
packers usually hide the imports of a file and often show only the
minimal imports of GetProcAddress and LoadLibrary from kernel32.dll.
also, imports need to be distinguished as delay-load or not.  perhaps it
would be useful to include the static data of the unpacked form of the
file as well (or at least a reference to MAEC data about the unpacked
form of the file).
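
One way to record whether the visible imports are likely a packer stub is a simple heuristic over the import table. The Python sketch below is only that: the threshold of five imports, the A/W name variants, and the function name are assumptions, and the input is assumed to be a mapping of DLL name to imported function names produced by some other parser.

```python
from typing import Dict, List

# Functions a packer stub typically keeps so it can rebuild the real import
# table at run time (names from the message above; A/W variants assumed).
RESOLVERS = {"getprocaddress"}
LOADERS = {"loadlibrary", "loadlibrarya", "loadlibraryw"}

def looks_import_stripped(imports: Dict[str, List[str]],
                          max_imports: int = 5) -> bool:
    """Heuristic: True if the visible import table is tiny and loader-only."""
    names = {fn.lower() for funcs in imports.values() for fn in funcs}
    has_resolver = bool(names & RESOLVERS)
    has_loader = bool(names & LOADERS)
    return len(names) <= max_imports and has_resolver and has_loader
```

A True result would suggest that the initially visible imports say little about the file, and that the imports of the unpacked form are worth recording separately.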

i believe Scot suggested information about how it starts up...  i can
think of a couple interpretations of this: first, does this mean how the
program is launched (e.g. click on it in explorer, an existing
vulnerability, etc.) or, second, the command line parameters it
uses/requires and other environment settings to make it function?  i'm
not sure either comes from static analysis, but it might.

btw: where/when is the next meeting to discuss MAEC?

mike mccarl


RE: PE Static Analysis Attributes?

Mayank.2.Bhatnagar
Hi Michael,

You are right about the packers. There are many, yet if we can identify at least one of them, we learn what kind of packing has been done. For multi-compressed malware, or if there is a trend of that kind, we may be able to identify it and attribute it.

Hi Penny,

By DLL Chars, I was mainly interested in knowing whether it is a DLL/EXE or anything else.
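
For reference, the DLL/EXE distinction lives in the Characteristics field of the PE file header (the IMAGE_FILE_DLL flag, 0x2000), which is separate from the optional header's DllCharacteristics field. A minimal Python sketch of the check; the function name and the "Other" fallback are assumptions, not something the thread defines:

```python
import struct

IMAGE_FILE_EXECUTABLE_IMAGE = 0x0002
IMAGE_FILE_DLL = 0x2000

def pe_type(data: bytes) -> str:
    """Classify a PE file as "DLL", "EXE" or "Other" from its file header."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)  # e_lfanew
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # Characteristics is the last 2 bytes of the 20-byte file header.
    (characteristics,) = struct.unpack_from("<H", data, pe_offset + 4 + 18)
    if characteristics & IMAGE_FILE_DLL:
        return "DLL"
    if characteristics & IMAGE_FILE_EXECUTABLE_IMAGE:
        return "EXE"
    return "Other"
```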

Thanks & Regards,
Mayank

-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of McCarl, Michael
Sent: Wednesday, November 10, 2010 9:57 PM
To: maec-discussion-list Malware Attribute Enumeration Discussion
Subject: RE: PE Static Analysis Attributes?


RE: PE Static Analysis Attributes?

Kirillov, Ivan A.
Thanks for the input everyone. Here's the revised list:

-DLL Count
-PE Type
        *DLL/EXE/Other?
-Strings (minimum length = 3?)
        *Hash value(s)
        *Encoding (Unicode/ANSI)
-Headers
        *DOS Header
                *Hash value(s)
        *File Header
                *Hash value(s)
        *Optional Header
                *Hash value(s)
        *Section Header
                *Hash value(s)
-Linker/Version
-Size of Code
-Size of Image
-Size of Initialized Data
-Size of Uninitialized Data
-Size of Appended Data
-Version Info
        *File Version
        *Product Version
        *Company Name
-Digital Certificate Info
        *Validity
        *Issuer
-Imports
        *Delay-load (yes/no)
-Exports
-Entry Point Address
-Base Code Address
-Base Data Address
-# of Sections
-Section Info
        *Name
        *Entropy
        *Virtual Size
        *Virtual Address
-Resources
        *Hash value(s)
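
As an illustration of how the section-related entries and "Size of Appended Data" could be computed, here is a minimal Python sketch that walks the section table. The names are hypothetical and a well-formed file is assumed. "Appended data" is taken to mean everything past the end of the last section's raw data, so a certificate table stored at the end of a signed file would be counted as well.

```python
import struct

def section_table(data: bytes) -> list:
    """Return name, virtual size/address and raw extent of each section."""
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)  # e_lfanew
    coff = pe_offset + 4                                 # skip "PE\0\0"
    (num_sections,) = struct.unpack_from("<H", data, coff + 2)
    (opt_size,) = struct.unpack_from("<H", data, coff + 16)
    first = coff + 20 + opt_size                         # first section header
    sections = []
    for i in range(num_sections):
        # Each section header is 40 bytes; only the first 24 are read here.
        name, vsize, vaddr, raw_size, raw_offset = struct.unpack_from(
            "<8sIIII", data, first + 40 * i)
        sections.append({
            "name": name.rstrip(b"\0").decode("ascii", "replace"),
            "virtual_size": vsize,
            "virtual_address": vaddr,
            "raw_size": raw_size,
            "raw_offset": raw_offset,
        })
    return sections

def appended_data_size(data: bytes) -> int:
    """Bytes in the file past the end of the last section's raw data."""
    ends = [s["raw_offset"] + s["raw_size"] for s in section_table(data)]
    if not ends:
        return 0
    return max(0, len(data) - max(ends))
```

The number of sections is then len(section_table(data)), and per-section entropy could be computed over data[raw_offset:raw_offset + raw_size].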

As for packers, we already have the following mechanism for characterizing them, under File_System_Object_Attributes of the ObjectType; see http://maec.mitre.org/language/version1.01/xsddocs/MAEC/complexType/ObjectType.File_System_Object_Attributes.Packing.html

Perhaps we should implement something similar for installers?

I agree that we should be careful regarding strings, since there are bound to be a large number for most binaries. I think having a minimum length for inclusion makes sense, as Mike said. It's also worth considering how to best structure such data.

Mike, do you think it's worth having both the initially visible imports and those visible after unpacking?

Scot, are you referring to persistence with regard to malware startup (e.g. an autorun registry key)? Or, as Mike said, the initial launch conditions?

Mayank, does the PE type category I added above cover the DLL/EXE attribute you referred to?

Have a good weekend!

Regards,
Ivan

Ivan Kirillov
MAEC Working Group
The MITRE Corporation

-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Mayank.2.Bhatnagar
Sent: Friday, November 12, 2010 9:26 AM
To: McCarl, Michael; maec-discussion-list Malware Attribute Enumeration Discussion
Subject: RE: PE Static Analysis Attributes?


RE: PE Static Analysis Attributes?

Mayank.2.Bhatnagar
Yes Ivan,
Thanks. This list looks exhaustive and complete.

Thanks & Regards,
Mayank

-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Kirillov, Ivan A.
Sent: Saturday, November 13, 2010 3:06 AM
To: maec-discussion-list Malware Attribute Enumeration Discussion
Subject: RE: PE Static Analysis Attributes?

Thanks for the input everyone. Here's the revised list:

-DLL Count
-PE Type
        *Dll/EXE/Other?
-Strings (minimum length = 3?)
        *Hash value(s)
        *Encoding (Unicode/ANSI)
-Headers
        *DOS Header
                *Hash value(s)
        *File Header
                *Hash value(s)
        *Optional Header
                *Hash value(s)
        *Section Header
                *Hash value(s)
-Linker/Version
-Size of Code
-Size of Image
-Size of Initialized Data
-Size of Uninitialized Data
-Size of Appended Data
-Version Info
        *File Version
        *Product Version
        *Company Name
-Digital Certificate Info
        *Validity
        *Issuer
-Imports
        *Delay-load (yes/no)
-Exports
-Entry Point Address
-Base Code Address
-Base Data Address
-# of Sections
-Section Info
        *Name
        *Entropy
        *Virtual Size
        *Virtual Address
-Resources
        *Hash value(s)

As far as packers, we already have the following mechanism for their characterization, under File_System_Object_Attributes of the ObjectType - see http://maec.mitre.org/language/version1.01/xsddocs/MAEC/complexType/ObjectType.File_System_Object_Attributes.Packing.html

Perhaps we should implement something similar for installers?

I agree that we should be careful regarding strings, since there are bound to be a large number for most binaries. I think having a minimum length for inclusion makes sense, as Mike said. It's also worth considering how to best structure such data.

Mike, do you think it's worth having both initially visible imports as well as those visible after unpack?

Scot, are you referring to persistence with regards to malware startup (e.g. autorun registry key)? Or as Mike said, the initial launch conditions?

Mayank, does the PE type category I added above cover the DLL/EXE attribute you referred to?

Have a good weekend!

Regards,
Ivan

Ivan Kirillov
MAEC Working Group
The MITRE Corporation

-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Mayank.2.Bhatnagar
Sent: Friday, November 12, 2010 9:26 AM
To: McCarl, Michael; maec-discussion-list Malware Attribute Enumeration Discussion
Subject: RE: PE Static Analysis Attributes?

Hi Michael,

You are right about the packer thing. There are many, yet we can still identify at least one which we can get to know what kind of packing has been done. For any multi-compressed malware, or if there is a trend like that, we may be able to identify it and attribute it.

Hi Penny,

By DLL Chars, I was mainly interested in knowing whether it is a DLL/EXE or anything else.

Thanks & Regards,
Mayank

-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of McCarl, Michael
Sent: Wednesday, November 10, 2010 9:57 PM
To: maec-discussion-list Malware Attribute Enumeration Discussion
Subject: RE: PE Static Analysis Attributes?

because the pe structure is well documented, i think it's safe to include all the values from the dos header, file header, optional header and section headers.  additionally, hash values of each of these headers would be necessary. (btw, this would include dll characteristics as Mayank suggests)

additionally, hash values for strings and resources would be useful. (we might want to specify that a "string" is of a specific minimum length to eliminate strings of 1 or 2 character).  also, we would need to know if the string is a unicode or ansi string.

i think that information about the packer (if packed) is ok, but i think it's not as useful as it once was (mainly because of the number of packers).

one item that isn't just a pe file attribute is the various names
(aliases) the file has been known to use.  

with respect to hash values, i'd like to comment that while there are lots of hash values we could use, most aren't very useful unless you're searching for other files that have the exact same hash value.  if "context triggered piecewise hashing" (jesse kornblum,
http://www.dfrws.org/2006/proceedings/12-Kornblum.pdf) were used, files (or portions thereof) could be compared for similarity instead of just simple exact matches.

regarding imports, would this be the "initially visible" imports?
packers usually hide the imports of a file and often show only the minimal imports of GetProcAddress and LoadLibrary from kernel32.dll.
also, imports need to be distinguished as delay-load or not.  perhaps it would be useful to include the static data of the unpacked form of the file as well (or at least a reference to MAEC data about the unpacked form of the file).

i believe Scot suggested information about how it starts up...  i can think of a couple of interpretations of this: first, how the program is launched (e.g. click on it in explorer, an existing vulnerability, etc.), or second, the command line parameters it uses/requires and other environment settings to make it function.  i'm not sure either comes from static analysis, but it might.

btw: where/when is the next meeting to discuss MAEC?

mike mccarl


===================================================================================
From: Chase, Melissa P. [mailto:[hidden email]]
Sent: Wednesday, November 10, 2010 8:20 PM
To: Mayank.2.Bhatnagar; Kirillov, Ivan A.; maec-discussion-list Malware Attribute Enumeration Discussion
Subject: RE: PE Static Analysis Attributes?

Hi Mayank,

Could you elaborate on #1? What DLL characteristics would you want to see included?

Thanks,

Penny


RE: PE Static Analysis Attributes?

McCarl, Michael
In reply to this post by Kirillov, Ivan A.
with respect to sections, there will be 2 hash values: the hash of the
header and the hash of the section data it contains (except for
uninitialized data sections). adding "Hash" under "Section Info" should
cover this.
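
as a rough illustration of the two hashes, here is a minimal Python sketch. it assumes the standard 40-byte section header layout, and it assumes SHA-256 as the hash (the thread does not settle on an algorithm). the function name section_hashes is hypothetical.

```python
import hashlib
import struct

# Layout of a 40-byte IMAGE_SECTION_HEADER (little-endian):
# Name, VirtualSize, VirtualAddress, SizeOfRawData, PointerToRawData,
# PointerToRelocations, PointerToLinenumbers, NumberOfRelocations,
# NumberOfLinenumbers, Characteristics.
SECTION_HEADER = struct.Struct("<8sIIIIIIHHI")


def section_hashes(image: bytes, header_offset: int) -> dict:
    """Return the hash of one section header and of its raw data.

    The data hash is None when the section has no raw data on disk,
    which is the case for uninitialized data sections.
    SHA-256 is an assumption made for this sketch.
    """
    header = image[header_offset:header_offset + SECTION_HEADER.size]
    (name, _vsize, _vaddr, raw_size, raw_ptr,
     _relocs, _lines, _nrelocs, _nlines, _flags) = SECTION_HEADER.unpack(header)

    data_hash = None
    if raw_size > 0:
        data = image[raw_ptr:raw_ptr + raw_size]
        data_hash = hashlib.sha256(data).hexdigest()

    return {
        "name": name.rstrip(b"\x00").decode("ascii", "replace"),
        "header_hash": hashlib.sha256(header).hexdigest(),
        "data_hash": data_hash,
    }
```

the caller is expected to supply the file offset of each section header, which it would get from walking the file and optional headers.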

i'd like to clarify the types of imports:  

    1. the "basic" imports commonly known as simply "imports".  these
       are references made to external libraries that are resolved at
       load time by the system loader.

    2. "delay-load" imports. like basic imports, these are references
       made to external libraries, but are not resolved at load time by
       the system loader. they are resolved at run time when needed.
       like the basic imports, there is a separate entry in the pe
       optional header for delay-load imports.

    3. "initially visible" imports.  this is a term i made up.  often
       when a pe file is packed, the import tables are also included in
       the packer's compression mechanism making them inaccessible to
       the loader.  to get the program to load/run, a new import table
       (usually a minimal set of imports) is created consisting of the
       set of external functions the unpacker requires to decompress
       the packed code. the unpack routine can then decompress the
       original import tables and load them at run time. presumably,
       the unpack routine could also initialize the delay-load import
       tables so they function as a normal delay-load import.

combining all these categories yields the following:

    initially visible imports
    initially visible delay-load imports
    initially hidden (packed) imports
    initially hidden (packed) delay-load imports
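
the four categories above are just the combinations of two yes/no properties. a minimal Python sketch of that idea follows. the names ImportCategory and categorize_import are hypothetical and not part of any MAEC schema.

```python
from enum import Enum


class ImportCategory(Enum):
    """The four combined categories listed above."""
    VISIBLE = "initially visible"
    VISIBLE_DELAY_LOAD = "initially visible delay-load"
    HIDDEN = "initially hidden (packed)"
    HIDDEN_DELAY_LOAD = "initially hidden (packed) delay-load"


def categorize_import(hidden_by_packer: bool, delay_load: bool) -> ImportCategory:
    """Map the two independent properties onto one category."""
    if hidden_by_packer:
        return ImportCategory.HIDDEN_DELAY_LOAD if delay_load else ImportCategory.HIDDEN
    return ImportCategory.VISIBLE_DELAY_LOAD if delay_load else ImportCategory.VISIBLE
```

for example, categorize_import(False, True) gives the "initially visible delay-load" category.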


-----Original Message-----
From: [hidden email]
[mailto:[hidden email]] On Behalf Of
Kirillov, Ivan A.
Sent: Friday, November 12, 2010 4:36 PM
To: maec-discussion-list Malware Attribute Enumeration Discussion
Subject: RE: PE Static Analysis Attributes?

Thanks for the input everyone. Here's the revised list:

-DLL Count
-PE Type
        *DLL/EXE/Other?
-Strings (minimum length = 3?)
        *Hash value(s)
        *Encoding (Unicode/ANSI)
-Headers
        *DOS Header
                *Hash value(s)
        *File Header
                *Hash value(s)
        *Optional Header
                *Hash value(s)
        *Section Header
                *Hash value(s)
-Linker/Version
-Size of Code
-Size of Image
-Size of Initialized Data
-Size of Uninitialized Data
-Size of Appended Data
-Version Info
        *File Version
        *Product Version
        *Company Name
-Digital Certificate Info
        *Validity
        *Issuer
-Imports
        *Delay-load (yes/no)
-Exports
-Entry Point Address
-Base Code Address
-Base Data Address
-# of Sections
-Section Info
        *Name
        *Entropy
        *Virtual Size
        *Virtual Address
-Resources
        *Hash value(s)
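
For the Entropy attribute under Section Info, the list does not say which measure is meant. A minimal sketch, assuming Shannon entropy in bits per byte over the section's raw data (the function name shannon_entropy is hypothetical):

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    # Sum -p * log2(p) over the byte values that actually occur.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

Values close to 8.0 often suggest compressed or encrypted data, but that reading is a heuristic and not a definitive sign of packing.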

As far as packers, we already have the following mechanism for their
characterization, under File_System_Object_Attributes of the ObjectType
- see
http://maec.mitre.org/language/version1.01/xsddocs/MAEC/complexType/ObjectType.File_System_Object_Attributes.Packing.html

Perhaps we should implement something similar for installers?

I agree that we should be careful regarding strings, since there are
bound to be a large number for most binaries. I think having a minimum
length for inclusion makes sense, as Mike said. It's also worth
considering how to best structure such data.
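
To make the minimum-length and encoding ideas concrete, here is a minimal Python sketch. It assumes "ANSI" means runs of printable ASCII bytes and "Unicode" means UTF-16LE text limited to the same printable range, which is a simplification. The function name extract_strings and the tuple layout are hypothetical.

```python
import re


def extract_strings(data: bytes, min_length: int = 3) -> list:
    """Return sorted (offset, encoding, text) tuples for printable runs.

    Only runs of at least min_length characters are kept.
    """
    ansi = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    wide = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_length)

    results = []
    for match in ansi.finditer(data):
        results.append((match.start(), "ANSI", match.group().decode("ascii")))
    for match in wide.finditer(data):
        results.append((match.start(), "Unicode", match.group().decode("utf-16-le")))
    return sorted(results)
```

Recording the offset alongside each string is one possible way to structure the data, since it keeps duplicates apart and lets a string be tied back to a section.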

Mike, do you think it's worth having both initially visible imports as
well as those visible after unpack?

Scot, are you referring to persistence with regards to malware startup
(e.g. autorun registry key)? Or as Mike said, the initial launch
conditions?

Mayank, does the PE type category I added above cover the DLL/EXE
attribute you referred to?

Have a good weekend!

Regards,
Ivan

Ivan Kirillov
MAEC Working Group
The MITRE Corporation


RE: EXTERNAL: RE: PE Static Analysis Attributes?

McCarl, Michael
i don't think this moves us toward packed vs. unpacked descriptions
(although, that is something to think about).  all it suggests is that
attributes for import elements support several different types of
imports.

a possible scenario might be to define an "importtype" enumeration that
identifies the different types and allow it to be included in an import
definition.  an example of such an element might be:

<Import fileName="msvcrt.dll" importtype="delay-load"
ordinal="0">_getdrives</Import>
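
to show how such an enumeration might be enforced when the element is produced, here is a minimal Python sketch. only "delay-load" comes from the example above. the other values in IMPORT_TYPES and the function name make_import are hypothetical placeholders, not schema definitions.

```python
import xml.etree.ElementTree as ET

# Hypothetical enumeration values; only "delay-load" appears in the thread.
IMPORT_TYPES = {"basic", "delay-load", "packed", "packed-delay-load"}


def make_import(file_name: str, function: str, import_type: str, ordinal: int) -> ET.Element:
    """Build an <Import> element, rejecting unknown importtype values."""
    if import_type not in IMPORT_TYPES:
        raise ValueError("unknown importtype: %r" % import_type)
    elem = ET.Element("Import", {
        "fileName": file_name,
        "importtype": import_type,
        "ordinal": str(ordinal),
    })
    elem.text = function
    return elem


element = make_import("msvcrt.dll", "_getdrives", "delay-load", 0)
print(ET.tostring(element, encoding="unicode"))
```

in a real schema the check would live in the XSD enumeration itself. the Python check only mirrors it.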


-----Original Message-----
From: Dan Waitman [mailto:[hidden email]]
Sent: Monday, November 15, 2010 8:50 AM
To: McCarl, Michael
Cc: Kirillov, Ivan A.; maec-discussion-list Malware Attribute
Enumeration Discussion
Subject: Re: EXTERNAL: RE: PE Static Analysis Attributes?

Now it sounds like we're getting to the point where we need to be able
to define a packed PE description and an unpacked PE description.
... and possibly even a series of unpacked PE's for the various stages
of unpacking given packers that decrypt a function at a time.


RE: PE Static Analysis Attributes?

Kirillov, Ivan A.
All, thanks for the clarifications and comments.

Mike, I think your categories of initially visible and initially hidden (packed) imports make good sense.

Scot, a particular persistence mechanism would likely be composed of several low-level actions that are linked to a MAEC behavior, although it's certainly worth considering how to accurately define this link. However, the startup location/parameters are something we don't have at the moment, and I certainly agree that they should be included. This is also something we'll likely incorporate in the next schema release.

Here's the latest list of attributes:

-DLL Count
-PE Type
        *DLL/EXE/Other?
-Strings (minimum length = 3?)
        *Hash value(s)
        *Encoding (Unicode/ANSI)
-Headers
        *DOS Header
                *Hash value(s)
        *File Header
                *Hash value(s)
        *Optional Header
                *Hash value(s)
        *Section Header
                *Hash value(s)
-Linker/Version
-Size of Code
-Size of Image
-Size of Initialized Data
-Size of Uninitialized Data
-Size of Appended Data
-Version Info
        *File Version
        *Product Version
        *Company Name
-Digital Certificate Info
        *Validity
        *Issuer
-Imports
        *Initially visible
        *Initially visible delay-load
        *Initially hidden (packed)
        *Initially hidden (packed) delay-load
-Exports
-Entry Point Address
-Base Code Address
-Base Data Address
-# of Sections
-Section Info
        *Name
        *Hash (header/section data)
        *Entropy
        *Virtual Size
        *Virtual Address
-Resources
        *Hash value(s)

With regards to packed vs. unpacked PE descriptions, I believe this could be a useful feature to have, although I'm not sure how to best structure it. I think Dan was hinting at a nested series of descriptions, something like:

*Packed Description
        *Unpacked Description 1
                *Unpacked Description 2
                ...
                        *Unpacked Description n
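
As a rough illustration of that nesting, here is a minimal Python sketch in which each description optionally holds the description of the next unpacking stage. The class PEDescription, its fields, and the helper unpacking_chain are hypothetical and only model the outline above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PEDescription:
    """Static attributes of one stage, plus the next unpacked stage if any."""
    label: str
    imports: list = field(default_factory=list)
    unpacked: Optional["PEDescription"] = None


def unpacking_chain(description: Optional[PEDescription]) -> list:
    """Walk from the packed description down to the last unpacked stage."""
    chain = []
    while description is not None:
        chain.append(description.label)
        description = description.unpacked
    return chain


stage2 = PEDescription("Unpacked Description 2", imports=["CreateFileA"])
stage1 = PEDescription("Unpacked Description 1", unpacked=stage2)
packed = PEDescription("Packed Description",
                       imports=["LoadLibraryA", "GetProcAddress"],
                       unpacked=stage1)
print(unpacking_chain(packed))
```

A flat list of stages with an index would carry the same information. The nested form just mirrors the outline more directly.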

On the other hand, Mike suggested adding attributes for the purpose of supporting different import types. I think this second method would take less time to implement in the schema, although the first could still be useful for understanding the "structure" of the PE packing. Any further thoughts on this?

Regards,
Ivan

Ivan Kirillov
MAEC Working Group
The MITRE Corporation

-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of McCarl, Michael
Sent: Monday, November 15, 2010 11:46 AM
To: Dan Waitman; maec-discussion-list Malware Attribute Enumeration Discussion
Subject: RE: EXTERNAL: RE: PE Static Analysis Attributes?

i don't think this moves us toward packed vs. unpacked descriptions
(although, that is something to think about).  all it suggests is that
attributes for import elements support several different types of
imports.

a possible scenario might be to define an "importtype" enumeration that
identifies the different types and allow it to be included in an import
definition.  an example of such an element might be:

<Import fileName="msvcrt.dll" importtype="delay-load"
ordinal="0">_getdrives</Import>


-----Original Message-----
From: Dan Waitman [mailto:[hidden email]]
Sent: Monday, November 15, 2010 8:50 AM
To: McCarl, Michael
Cc: Kirillov, Ivan A.; maec-discussion-list Malware Attribute
Enumeration Discussion
Subject: Re: EXTERNAL: RE: PE Static Analysis Attributes?

Now it sounds like we're getting to the point where we need to be able
to define a packed PE description and an unpacked PE description.
... and possibly even a series of unpacked PE's for the various stages
of unpacking given packers that decrypt a function at a time.

On 11/15/2010 8:11 AM, McCarl, Michael wrote:
> with respect to sections, there will be 2 hash values: the hash of the

> header and the hash of the section data it contains (except for
> uninitialized data sections). adding "Hash" under "Section Info"
> should cover this.
>
> i'd like to clarify the types of imports:  
>
>     1. the "basic" imports commonly known as simply "imports".  these
> are references
>        made to external libraries that are resolved at load time by
> the system loader.
>
>     2. "delay-load" imports. like basic imports, these are references
> made to external
>        libraries, but are not resolved at load time by the system
> loader. they are resolved
>        at run time when needed.  like the basic imports, there is a
> separate entry in the
>        pe optional header for delay-load imports.
>
>     3. "initially visible" imports.  this is a term i made up.  often
> when a pe file is packed,
>        the import tables are also included in the packer's compression

> mechanism making them
>        inaccessible to the loader.  to get the program to load/run, a
> new import table (usually
>        a minimal set of imports) is created consisting of the set of
> external functions the
>        unpacker requires to decompress the packed code. the unpack
> routine can then decompress
>        the original import tables and load them at run time.
> presumably, the unpack routine could
>        also initialize the delay-load import tables so they function
> as a normal delay-load import.
>
> combining all these categories yields the following:
>
>     initially visible imports
>     initially visible delay-load imports
>     initially hidden (packed) imports
>     initially hidden (packed) delay-load imports
>
>
> -----Original Message-----
> From: [hidden email]
> [mailto:[hidden email]] On Behalf Of
> Kirillov, Ivan A.
> Sent: Friday, November 12, 2010 4:36 PM
> To: maec-discussion-list Malware Attribute Enumeration Discussion
> Subject: RE: PE Static Analysis Attributes?
>
> Thanks for the input everyone. Here's the revised list:
>
> -DLL Count
> -PE Type
> *Dll/EXE/Other?
> -Strings (minimum length = 3?)
> *Hash value(s)
> *Encoding (Unicode/ANSI)
> -Headers
> *DOS Header
> *Hash value(s)
> *File Header
> *Hash value(s)
> *Optional Header
> *Hash value(s)
> *Section Header
> *Hash value(s)
> -Linker/Version
> -Size of Code
> -Size of Image
> -Size of Initialized Data
> -Size of Uninitialized Data
> -Size of Appended Data
> -Version Info
> *File Version
> *Product Version
> *Company Name
> -Digital Certificate Info
> *Validity
> *Issuer
> -Imports
> *Delay-load (yes/no)
> -Exports
> -Entry Point Address
> -Base Code Address
> -Base Data Address
> -# of Sections
> -Section Info
> *Name
> *Entropy
> *Virtual Size
> *Virtual Address
> -Resources
> *Hash value(s)
>
> As far as packers, we already have the following mechanism for their
> characterization, under File_System_Object_Attributes of the
> ObjectType
> - see
> http://maec.mitre.org/language/version1.01/xsddocs/MAEC/complexType/Ob
> je ctType.File_System_Object_Attributes.Packing.html
>
> Perhaps we should implement something similar for installers?
>
> I agree that we should be careful regarding strings, since there are
> bound to be a large number for most binaries. I think having a minimum

> length for inclusion makes sense, as Mike said. It's also worth
> considering how to best structure such data.
>
> Mike, do you think it's worth having both initially visible imports as

> well as those visible after unpack?
>
> Scot, are you referring to persistence with regards to malware startup

> (e.g. autorun registry key)? Or as Mike said, the initial launch
> conditions?
>
> Mayank, does the PE type category I added above cover the DLL/EXE
> attribute you referred to?
>
> Have a good weekend!
>
> Regards,
> Ivan
>
> Ivan Kirillov
> MAEC Working Group
> The MITRE Corporation
>
> -----Original Message-----
> From: [hidden email]
> [mailto:[hidden email]] On Behalf Of
> Mayank.2.Bhatnagar
> Sent: Friday, November 12, 2010 9:26 AM
> To: McCarl, Michael; maec-discussion-list Malware Attribute
> Enumeration Discussion
> Subject: RE: PE Static Analysis Attributes?
>
> Hi Michael,
>
> You are right about the packer thing. There are many, yet we can still

> identify at least one which we can get to know what kind of packing
> has been done. For any multi-compressed malware, or if there is a
> trend like that, we may be able to identify it and attribute it.
>
> Hi Penny,
>
> By DLL Chars, I was mainly interested in knowing whether it is a
> DLL/EXE or anything else.
>
> Thanks & Regards,
> Mayank
>
> -----Original Message-----
> From: [hidden email]
> [mailto:[hidden email]] On Behalf Of
> McCarl, Michael
> Sent: Wednesday, November 10, 2010 9:57 PM
> To: maec-discussion-list Malware Attribute Enumeration Discussion
> Subject: RE: PE Static Analysis Attributes?
>
> because the pe structure is well documented, i think it's safe to
> include all the values from the dos header, file header, optional
> header and section headers.  additionally, hash values of each of
> these headers would be necessary. (btw, this would include dll
> characteristics as Mayank suggests)
>
> additionally, hash values for strings and resources would be useful.
> (we might want to specify that a "string" is of a specific minimum
> length to eliminate strings of 1 or 2 character).  also, we would need

> to know if the string is a unicode or ansi string.
>
> i think that information about the packer (if packed) is ok, but i
> think it's not as useful as it once was (mainly because of the number
> of packers).
>
> one item that isn't just a pe file attribute is the various names
> (aliases) the file has been known to use.  
>
> with respect to hash values, i'd like to comment that while there are
> lots of hash values we could use, most aren't very useful unless
> you're searching for other files that have the exact same hash value.

> if "context triggered piecewise hashing" (jesse kornblum,
> http://www.dfrws.org/2006/proceedings/12-Kornblum.pdf) were used,
> files (or portions thereof) could be compared for similarity instead
> of just simple exact matches.
>
> regarding imports, would this be the "initially visible" imports?
> packers usually hide the imports of a file and often show only the
> minimal imports of GetProcAddress and LoadLibrary from kernel32.dll.
> also, imports need to be distinguished as delay-load or not.  perhaps
> it would be useful to include the static data of the unpacked form of
> the file as well (or at least a reference to MAEC data about the
> unpacked form of the file).
>
> i believe Scot suggested information about how it starts up...  i can
> think of a couple interpretations of this: first, does this mean how
> the program is launched (e.g. click on it in explorer, an existing
> vulnerability, etc.) or; second, the command line parameters it
> uses/requires and other environment settings to make it function.  i'm
> not sure either comes from static analysis, but it might.
>
> btw: where/when is the next meeting to discuss MAEC?
>
> mike mccarl
>
>
> ======================================================================
> From: Chase, Melissa P. [mailto:[hidden email]]
> Sent: Wednesday, November 10, 2010 8:20 PM
> To: Mayank.2.Bhatnagar; Kirillov, Ivan A.; maec-discussion-list
> Malware Attribute Enumeration Discussion
> Subject: RE: PE Static Analysis Attributes?
>
> Hi Mayank,
>
> Could you elaborate on #1? What DLL characteristics would you want to
> see included?
>
> Thanks,
>
> Penny
>
> ======================================================================
>
> -----Original Message-----
> From: [hidden email]
> [mailto:[hidden email]] On Behalf Of
> Kirillov, Ivan A.
> Sent: Tuesday, November 09, 2010 3:15 PM
> To: maec-discussion-list Malware Attribute Enumeration Discussion
> Subject: PE Static Analysis Attributes?
>
> Hello Everyone,
>
> One of the things we're planning on adding to the next MAEC schema
> release is the capability of characterizing specific Windows PE binary
> attributes obtained through static analysis.
>
> We've taken a stab at compiling an initial list of these, but we'd
> appreciate your input on anything else that you think needs to be
> added (or removed!). Here's the list:
>
> -Strings
> -Linker/Version
> -Size of Code
> -Size of Initialized Data
> -Size of Uninitialized Data
> -Imports
> -Exports
> -Entry Point Address
> -Base Code Address
> -Base Data Address
> -Section Info
>         *Name
>         *Virtual Size
>         *Virtual Address
> -Resources
>
> Also, we hope to post to the MAEC website a tracker for MAEC-schema
> related development issues, providing you with more insight into
> planned improvements/revisions, in the next few weeks. I'll keep you
> posted.
>
> Regards,
> Ivan
>
> Ivan Kirillov
> MAEC Working Group
> The MITRE Corporation

RE: PE Static Analysis Attributes?

Kirillov, Ivan A.
Just realized that I mis-labeled the import category. It should read:

-Imports
        *Initially visible
        *Initially visible delay-load
        *Initially hidden packed
        *Initially hidden packed delay-load

Regards,
Ivan

Ivan Kirillov
MAEC Working Group
The MITRE Corporation

-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Kirillov, Ivan A.
Sent: Friday, November 19, 2010 12:53 PM
To: maec-discussion-list Malware Attribute Enumeration Discussion
Subject: RE: PE Static Analysis Attributes?

All, thanks for the clarifications and comments.

Mike, I think your categories of initially visible packed/unpacked imports make good sense.

Scot, a particular persistence mechanism would likely be composed of several low-level actions that are linked to a MAEC behavior, although it's certainly worth considering how to accurately define this link. However, the startup location/parameters are something we don't have at the moment, and I certainly agree that they should be included. This is also something we'll likely incorporate in the next schema release.

Here's the latest list of attributes:

-DLL Count
-PE Type
        *Dll/EXE/Other?
-Strings (minimum length = 3?)
        *Hash value(s)
        *Encoding (Unicode/ANSI)
-Headers
        *DOS Header
                *Hash value(s)
        *File Header
                *Hash value(s)
        *Optional Header
                *Hash value(s)
        *Section Header
                *Hash value(s)
-Linker/Version
-Size of Code
-Size of Image
-Size of Initialized Data
-Size of Uninitialized Data
-Size of Appended Data
-Version Info
        *File Version
        *Product Version
        *Company Name
-Digital Certificate Info
        *Validity
        *Issuer
-Imports
        *Initially visible
        *Initially visible delay-load
        *Initially visible packed
        *Initially visible packed delay-load
-Exports
-Entry Point Address
-Base Code Address
-Base Data Address
-# of Sections
-Section Info
        *Name
        *Hash (header/section data)
        *Entropy
        *Virtual Size
        *Virtual Address
-Resources
        *Hash value(s)
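
As a rough illustration of how a few of the attributes above could be computed, here is a minimal Python sketch. It is not part of MAEC: the function names, the dictionary keys, and the choice of SHA-256 are all assumptions made for the example, and it works on raw bytes that some PE parser is assumed to have already carved out. It covers section entropy, the two per-section hashes (header and data), and strings with a minimum length and an ANSI/Unicode tag.

```python
import hashlib
import math
import re
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: 0.0 (constant) to 8.0 (uniform)."""
    if not data:
        return 0.0
    total = len(data)
    return sum((n / total) * math.log2(total / n)
               for n in Counter(data).values())


def describe_section(name: str, header: bytes, body: bytes) -> dict:
    """Summarise one section; the key names are illustrative only."""
    return {
        "name": name,
        "header_hash": hashlib.sha256(header).hexdigest(),
        # Uninitialized-data sections have no bytes on disk, so no data hash.
        "data_hash": hashlib.sha256(body).hexdigest() if body else None,
        "entropy": shannon_entropy(body),
    }


def extract_strings(data: bytes, min_len: int = 3) -> dict:
    """Find printable runs of at least min_len characters, by encoding."""
    ansi = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    # UTF-16LE text in the ASCII range: each character is followed by 0x00.
    wide = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    return {
        "ansi": [m.group().decode("ascii") for m in ansi.finditer(data)],
        "unicode": [m.group().decode("utf-16-le") for m in wide.finditer(data)],
    }
```

High entropy (close to 8 bits per byte) is commonly read as a hint that a section is compressed or encrypted, which is why it sits naturally next to the packing discussion. The string matcher only handles characters in the printable ASCII range; anything beyond that would need a different approach.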

With regards to packed vs. unpacked PE descriptions, I believe this could be a useful feature to have, although I'm not sure how to best structure it. I think Dan was hinting at a nested series of descriptions, something like:

*Packed Description
        *Unpacked Description 1
                *Unpacked Description 2
                ...
                        *Unpacked Description n

On the other hand, Mike suggested adding attributes for the purpose of supporting different import types. I think this second method would take less time to implement in the schema, although the first could still be useful for understanding the "structure" of the PE packing. Any further thoughts on this?
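
To make the nested option concrete, the following Python sketch models a packed description that points at the description of the next unpacking stage. The class and field names (`PEDescription`, `label`, `imports`, `unpacked`) are hypothetical and chosen only for illustration; nothing here reflects an actual MAEC schema element.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class PEDescription:
    """Static attributes of one stage of a (possibly packed) PE file."""
    label: str
    imports: List[str] = field(default_factory=list)
    # Description of the next unpacking stage, if one is known.
    unpacked: Optional["PEDescription"] = None


def stages(desc: Optional[PEDescription]) -> Iterator[PEDescription]:
    """Yield the packed description followed by each unpacked stage."""
    while desc is not None:
        yield desc
        desc = desc.unpacked


def unpacking_depth(desc: PEDescription) -> int:
    """Number of unpacked stages nested below the outermost description."""
    return sum(1 for _ in stages(desc)) - 1


# Example: a packed file whose visible imports are only the unpacker's.
packed = PEDescription(
    label="packed",
    imports=["kernel32.dll!LoadLibraryA", "kernel32.dll!GetProcAddress"],
    unpacked=PEDescription(
        label="unpacked 1",
        unpacked=PEDescription(label="unpacked 2",
                               imports=["msvcrt.dll!_getdrives"]),
    ),
)
```

A chain like this records the structure of the packing, at the cost of repeating the full attribute set per stage; tagging each import with a type keeps everything in a single description instead.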

Regards,
Ivan

Ivan Kirillov
MAEC Working Group
The MITRE Corporation

-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of McCarl, Michael
Sent: Monday, November 15, 2010 11:46 AM
To: Dan Waitman; maec-discussion-list Malware Attribute Enumeration Discussion
Subject: RE: EXTERNAL: RE: PE Static Analysis Attributes?

i don't think this moves us toward packed vs. unpacked descriptions
(although, that is something to think about).  all it suggests is that
attributes for import elements support several different types of
imports.

a possible scenario might be to define an "importtype" enumeration that
identifies the different types and allow it to be included in an import
definition.  an example of such an element might be:

<Import fileName="msvcrt.dll" importtype="delay-load"
ordinal="0">_getdrives</Import>
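
A small Python sketch of how such an enumeration might be used to generate the element above. Only the "delay-load" value and the attribute names fileName, importtype and ordinal come from the example; the other three enumeration values and the helper name `make_import` are made up for illustration and are not defined anywhere in the schema.

```python
import xml.etree.ElementTree as ET
from enum import Enum


class ImportType(Enum):
    """Hypothetical "importtype" values for the four import categories."""
    VISIBLE = "visible"
    VISIBLE_DELAY_LOAD = "delay-load"
    HIDDEN_PACKED = "hidden-packed"
    HIDDEN_PACKED_DELAY_LOAD = "hidden-packed-delay-load"


def make_import(file_name: str, function: str,
                import_type: ImportType, ordinal: int) -> ET.Element:
    """Build an <Import> element shaped like the example in this message."""
    elem = ET.Element("Import", {
        "fileName": file_name,
        "importtype": import_type.value,
        "ordinal": str(ordinal),
    })
    elem.text = function
    return elem


elem = make_import("msvcrt.dll", "_getdrives",
                   ImportType.VISIBLE_DELAY_LOAD, 0)
xml_text = ET.tostring(elem, encoding="unicode")
```

Restricting the attribute to an enumeration means a validator can reject unknown import types, which a free-text attribute would not allow.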


-----Original Message-----
From: Dan Waitman [mailto:[hidden email]]
Sent: Monday, November 15, 2010 8:50 AM
To: McCarl, Michael
Cc: Kirillov, Ivan A.; maec-discussion-list Malware Attribute
Enumeration Discussion
Subject: Re: EXTERNAL: RE: PE Static Analysis Attributes?

Now it sounds like we're getting to the point where we need to be able
to define a packed PE description and an unpacked PE description.
... and possibly even a series of unpacked PE's for the various stages
of unpacking given packers that decrypt a function at a time.

On 11/15/2010 8:11 AM, McCarl, Michael wrote:
> with respect to sections, there will be 2 hash values: the hash of the
> header and the hash of the section data it contains (except for
> uninitialized data sections). adding "Hash" under "Section Info"
> should cover this.
>
> i'd like to clarify the types of imports:  
>
>     1. the "basic" imports commonly known as simply "imports".  these are
>        references made to external libraries that are resolved at load
>        time by the system loader.
>
>     2. "delay-load" imports. like basic imports, these are references made
>        to external libraries, but are not resolved at load time by the
>        system loader. they are resolved at run time when needed.  like the
>        basic imports, there is a separate entry in the pe optional header
>        for delay-load imports.
>
>     3. "initially visible" imports.  this is a term i made up.  often when
>        a pe file is packed, the import tables are also included in the
>        packer's compression mechanism making them inaccessible to the
>        loader.  to get the program to load/run, a new import table
>        (usually a minimal set of imports) is created consisting of the set
>        of external functions the unpacker requires to decompress the
>        packed code. the unpack routine can then decompress the original
>        import tables and load them at run time.  presumably, the unpack
>        routine could also initialize the delay-load import tables so they
>        function as a normal delay-load import.
>
> combining all these categories yields the following:
>
>     initially visible imports
>     initially visible delay-load imports
>     initially hidden (packed) imports
>     initially hidden (packed) delay-load imports
>
>
> -----Original Message-----
> From: [hidden email]
> [mailto:[hidden email]] On Behalf Of
> Kirillov, Ivan A.
> Sent: Friday, November 12, 2010 4:36 PM
> To: maec-discussion-list Malware Attribute Enumeration Discussion
> Subject: RE: PE Static Analysis Attributes?
>
> Thanks for the input everyone. Here's the revised list:
>
> -DLL Count
> -PE Type
>         *Dll/EXE/Other?
> -Strings (minimum length = 3?)
>         *Hash value(s)
>         *Encoding (Unicode/ANSI)
> -Headers
>         *DOS Header
>                 *Hash value(s)
>         *File Header
>                 *Hash value(s)
>         *Optional Header
>                 *Hash value(s)
>         *Section Header
>                 *Hash value(s)
> -Linker/Version
> -Size of Code
> -Size of Image
> -Size of Initialized Data
> -Size of Uninitialized Data
> -Size of Appended Data
> -Version Info
>         *File Version
>         *Product Version
>         *Company Name
> -Digital Certificate Info
>         *Validity
>         *Issuer
> -Imports
>         *Delay-load (yes/no)
> -Exports
> -Entry Point Address
> -Base Code Address
> -Base Data Address
> -# of Sections
> -Section Info
>         *Name
>         *Entropy
>         *Virtual Size
>         *Virtual Address
> -Resources
>         *Hash value(s)
>
> As far as packers, we already have the following mechanism for their
> characterization, under File_System_Object_Attributes of the
> ObjectType - see
> http://maec.mitre.org/language/version1.01/xsddocs/MAEC/complexType/ObjectType.File_System_Object_Attributes.Packing.html
>
> Perhaps we should implement something similar for installers?
>
> I agree that we should be careful regarding strings, since there are
> bound to be a large number for most binaries. I think having a minimum
> length for inclusion makes sense, as Mike said. It's also worth
> considering how to best structure such data.
>
> Mike, do you think it's worth having both initially visible imports as
> well as those visible after unpack?
>
> Scot, are you referring to persistence with regards to malware startup
> (e.g. autorun registry key)? Or as Mike said, the initial launch
> conditions?
>
> Mayank, does the PE type category I added above cover the DLL/EXE
> attribute you referred to?
>
> Have a good weekend!
>
> Regards,
> Ivan
>
> Ivan Kirillov
> MAEC Working Group
> The MITRE Corporation
>
> -----Original Message-----
> From: [hidden email]
> [mailto:[hidden email]] On Behalf Of
> Mayank.2.Bhatnagar
> Sent: Friday, November 12, 2010 9:26 AM
> To: McCarl, Michael; maec-discussion-list Malware Attribute
> Enumeration Discussion
> Subject: RE: PE Static Analysis Attributes?
>
> Hi Michael,
>
> You are right about the packer thing. There are many, yet we can still
> identify at least one, which tells us what kind of packing has been
> done. For any multi-compressed malware, or if there is a trend like
> that, we may be able to identify it and attribute it.
>
> Hi Penny,
>
> By DLL Chars, I was mainly interested in knowing whether it is a
> DLL/EXE or anything else.
>
> Thanks & Regards,
> Mayank