Malware research - Knowledge

242:. The origins of the idea to use system calls to analyze software can be found in the work of Forrest et al. Christodorescu et al. point out that malware authors cannot easily reorder system calls without changing the semantics of the program, which makes system call dependency graphs suitable for malware detection. They compute a difference between malware and goodware system call dependency graphs and use the resulting graphs for detection, achieving high detection rates. Kolbitsch et al. pre-compute symbolic expressions and evaluate them on the syscall parameters observed at runtime. 265:

Research in combining static and dynamic malware analysis techniques is also currently being conducted in an effort to minimize the shortcomings of both. Studies by researchers such as Islam et al. are working to integrate static and dynamic techniques in order to better analyze and classify malware

245:

They detect dependencies by observing whether the result obtained by evaluation matches the parameter values observed at runtime. Malware is detected by comparing the dependency graphs of the training and test sets. Fredrikson et al. describe an approach that uncovers distinguishing features in

28:

experimented with computer viruses and confirmed Neumann's postulate and investigated other properties of malware such as detectability and self-obfuscation using rudimentary encryption. His 1988 Doctoral dissertation was on the subject of computer viruses.

100:, which has been applied in the study of biological virus. Various virus propagation scenarios have been studied by researchers such as propagation of computer virus, fighting virus with virus like predator codes, effectiveness of patching etc. 40:. This problem must not be mistaken for that of determination within a broad class of programs that a virus is not present. This problem differs in that it does not require the ability to recognize all viruses. 89:(assuming there are no backups). Analysis of the virus reveals the public key, not the IV and SK needed for decryption, or the private key needed to recover the IV and SK. This result was the first to show that 293:

John von Neumann, "Theory of Self-Reproducing Automata", Part 1: Transcripts of lectures given at the University of Illinois, December 1949, Editor: A. W. Burks, University of Illinois, USA, 1966.

429:

R. Islam, R. Tian, L. M. Batten, and S. Versteeg: Classification of malware based on integrated staic and dynamic features, Journal of Network Computer Applications, 2013, p. 646-656.

200: 158: 368:, Proc. of the 6th joint meeting of the European software engineering conf. and the ACM SIGSOFT symp. on The foundations of software engineering, 2007, p. 5-14 240: 220: 320:

A. Young, M. Yung, "Cryptovirology: Extortion-Based Security Threats and Countermeasures," IEEE Symposium on Security & Privacy, pp. 129-141, 1996.

407:

in Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data (SIGMOD’08). New York, NY, USA: ACM Press, 2008, pp. 433-444

329:

H. Toyoizumi, A. Kara. Predators: Good Will Mobile Codes Combat against Computer Viruses. Proc. of the 2002 New Security Paradigms Workshop, 2002

16:

The notion of a self-reproducing computer program can be traced back to initial theories about the operation of complex automata.

103:

Behavioral malware detection has been researched more recently. Most approaches to behavioral detection are based on analysis of

44: 21: 311:

L. M. Adleman, "An Abstract Theory of Computer Viruses", Advances in Cryptology – --Crypto '88, LNCS 403, pp. 354-374, 1988.

81:

data on the victim's machine using the randomly generated IV and SK. The IV+SK are then encrypted using the virus writer's

96:

A growing area of computer virus research is to mathematically model the infection behavior of worms using models such as

250:

and leap mining. Babic et al. recently proposed a novel approach for both malware detection and classification based on

262:

from dependency graphs, and they show how such an automaton could be used for detection and classification of malware.

222:(either directly as a result or indirectly through output parameters) is later used as a parameter of system call 90: 36:, presented a rigorous proof that, in the general case, algorithmic determination of the presence of a virus is 59:

is ideal in constructing a virus that is highly resistant to reverse-engineering by presenting the notion of a

48: 339: 85:. In theory the victim must negotiate with the virus writer to get the IV+SK back in order to decrypt the 275: 97: 378: 20:

showed that in theory a program could reproduce itself. This constituted a plausibility result in

247: 167: 302:

Fred Cohen, "Computer Viruses", PhD Thesis, University of Southern California, ASP Press, 1988.

125: 67: 420:, in Proceedings of the 23rd Int. Conference on Computer Aided Verification, 2011, Springer. 417: 8: 37: 225: 205: 63:. A cryptovirus is a virus that contains and uses a public key and randomly generated 251: 444: 342:, IEEE/IIU Proc. of ICCCE '06, Kuala Lumpur, Malaysia, May 2006, Vol-I, p. 157-163. 64: 17: 52: 33: 352: 246:

malware system call dependency graphs. They extract significant behaviors using

255: 120: 112: 60: 438: 404: 365: 394:, Proc. of the 2010 IEEE Symposium on Security and Privacy, 2010, p. 45-60. 56: 392:

Synthesizing Near-Optimal Malware Specifications from Suspicious Behaviors

93:

can be used to devise malware that is robust against reverse-engineering.

55:. Ironically, it was later shown by Young and Yung that Adleman's work in 355:, Proc. of the 1996 IEEE Symp. on Security and Privacy, 1996, p. 120-129. 161: 116: 104: 71: 391: 86: 82: 25: 377:

C. Kolbitsch, P. Milani, C. Kruegel, E. Kirda, X. Zhou, and X. Wang:

259: 78: 351:

S. Forrest, S. A. Hofmeyr, A. Somayaji, T. A. Longstaff, Thomas A.:

390:

M. Fredrikson, S. Jha, M. Christodorescu, R. Sailer, and X. Yan:

108: 77:

In the cryptoviral extortion attack, the virus hybrid encrypts

379:

Effective and Efficient Malware Detection at the End Host

43:

Adleman's proof is perhaps the deepest result in malware

107:

dependencies. The executed binary code is traced using

228: 208: 170: 128: 234: 214: 194: 152: 164:, and edges represent dependencies. For example, 436: 405:Mining significant graph patterns by leap search 418:Malware Analysis with Tree Automata Inference 381:, The 18th USENIX Security Symposium, 2009. 366:Mining specifications of malicious behavior 340:Model-Based Analysis of Two Fighting Worms 403:X. Yan, H. Cheng, J. Han, and P. S. Yu: 115:to compute data-flow dependencies among 364:M. Christodorescu, S. Jha, C. Kruegel: 437: 202:if a result returned by system call 416:D. Babic, D. Reynaud, and D. Song: 13: 353:A Sense of Self for Unix Processes 14: 456: 338:Zakiya M. Tamimi, Javed I. Khan, 423: 410: 397: 384: 91:computational complexity theory 371: 358: 345: 332: 323: 314: 305: 296: 287: 183: 171: 147: 135: 1: 281: 7: 276:History of computer viruses 269: 258:. Their approach infers an 10: 461: 195:{\displaystyle (s,t)\in E} 49:Cantor's diagonal argument 47:to date and it relies on 32:Cohen's faculty advisor, 98:Lotka–Volterra equations 153:{\displaystyle G=(V,E)} 266:and malware variants. 236: 216: 196: 154: 237: 217: 197: 155: 68:initialization vector 226: 206: 168: 160:such that nodes are 126: 45:computability theory 22:computability theory 232: 212: 192: 150: 119:. The result is a 252:grammar inference 235:{\displaystyle t} 215:{\displaystyle s} 452: 430: 427: 421: 414: 408: 401: 395: 388: 382: 375: 369: 362: 356: 349: 343: 336: 330: 327: 321: 318: 312: 309: 303: 300: 294: 291: 248:concept analysis 241: 239: 238: 233: 221: 219: 218: 213: 201: 199: 198: 193: 159: 157: 156: 151: 111:or more precise 65:symmetric cipher 18:John von Neumann 460: 459: 455: 454: 453: 451: 450: 449: 435: 434: 433: 428: 424: 415: 411: 402: 398: 389: 385: 376: 372: 363: 359: 350: 346: 337: 333: 328: 324: 319: 315: 310: 306: 301: 297: 292: 288: 284: 272: 227: 224: 223: 207: 204: 203: 169: 166: 165: 127: 124: 123: 53:halting problem 51:as well as the 34:Leonard Adleman 12: 11: 5: 458: 448: 447: 432: 431: 422: 409: 396: 383: 370: 357: 344: 331: 322: 313: 304: 295: 285: 283: 280: 279: 278: 271: 268: 231: 211: 191: 188: 185: 182: 179: 176: 173: 149: 146: 143: 140: 137: 134: 131: 121:directed graph 113:taint analysis 9: 6: 4: 3: 2: 457: 446: 443: 442: 440: 426: 419: 413: 406: 400: 393: 387: 380: 374: 367: 361: 354: 348: 341: 335: 326: 317: 308: 299: 290: 286: 277: 274: 273: 267: 263: 261: 257: 256:tree automata 253: 249: 243: 229: 209: 189: 186: 180: 177: 174: 163: 144: 141: 138: 132: 129: 122: 118: 114: 110: 106: 101: 99: 94: 92: 88: 84: 80: 75: 73: 69: 66: 62: 58: 54: 50: 46: 41: 39: 35: 30: 27: 23: 19: 425: 412: 399: 386: 373: 360: 347: 334: 325: 316: 307: 298: 289: 264: 244: 162:system calls 117:system calls 102: 95: 76: 57:cryptography 42: 31: 15: 105:system call 72:session key 61:cryptovirus 38:undecidable 282:References 87:ciphertext 83:public key 26:Fred Cohen 260:automaton 187:∈ 79:plaintext 70:(IV) and 439:Category 270:See also 445:Malware 109:strace 74:(SK). 254:of 441:: 24:. 230:t 210:s 190:E 184:) 181:t 178:, 175:s 172:( 148:) 145:E 142:, 139:V 136:( 133:= 130:G

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.

Index