The rapid growth of visual data increases the importance of computer vision research on the automatic analysis and interpretation of content. Although the human nervous and sensory systems easily understand and recognize the activities taking place in a scene, these tasks remain among the most challenging research topics in computer vision. Activities vary with the number of participants. For instance, a single person can perform activities composed of various atomic actions, whereas in scenes with more than one person, interactions occur between people. Since interactions are mutual movements between multiple people, both the temporal changes in the scene and its spatial structure need to be modeled for analysis. In this study, long short-term memory networks and support vector machines are trained on features based on the positions and distances of human body joints for the automated classification of actions and interactions.
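
As a rough illustration of joint-based features of the kind described above, the following sketch builds one feature vector per frame from two skeletons. It is not the study's implementation: the 2D coordinates, the choice of all inter-person joint pairs, and the function names `frame_features` and `sequence_features` are assumptions made only for this example.

```python
import math


def frame_features(person_a, person_b):
    """Build one frame's feature vector from two skeletons.

    Each skeleton is a list of (x, y) joint coordinates (assumed 2D here).
    The vector holds the raw joint positions of both people, followed by
    the Euclidean distance between every joint of A and every joint of B.
    """
    features = []
    # Spatial structure: raw joint positions of both people.
    for x, y in person_a + person_b:
        features.extend([x, y])
    # Interaction cue: inter-person joint-to-joint distances.
    for ax, ay in person_a:
        for bx, by in person_b:
            features.append(math.hypot(ax - bx, ay - by))
    return features


def sequence_features(frames):
    """Map a clip, given as a list of (person_a, person_b) frames,
    to a list of per-frame feature vectors."""
    return [frame_features(a, b) for a, b in frames]


# Hypothetical two-joint skeletons, one frame.
clip = [([(0.0, 0.0), (3.0, 0.0)], [(0.0, 4.0), (3.0, 4.0)])]
vectors = sequence_features(clip)
```

Under these assumptions, the per-frame vectors of a clip could be fed in order to a recurrent model to capture temporal change, or summarized into a single fixed-length vector for a support vector machine.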