Sergey Karayev

I recently finished a PhD in Computer Science at UC Berkeley, advised by Trevor Darrell.
I am now building the universal knowledge assessment engine at Gradescope, starting with college STEM courses.
Check out my CV, LinkedIn, Google Scholar, and GitHub, or email me.


  • Recognizing Image Style

    Image style is integral to visual communication, but has received scant research attention. We gather datasets of photo and painting style, and evaluate convolutional neural nets for the task.

  • Anytime Visual Recognition

    Features have different costs, and different classes benefit from different features. A multi-class recognition system should dynamically select features to maximize performance under a cost budget.

  • Depth-informed Object Detection

    Using the Microsoft Kinect, we gather a large dataset of crowded indoor scenes. We investigate ways to unify state-of-the-art object detection systems and improve them with depth information.

  • Probabilistic Local Image Features

    Our method for additively decomposing local image patches shows the best performance on a novel transparent object recognition dataset. We extend the model to multiple layers and apply it to general object classification.

  • Foveal Explorer

    A JavaScript applet for exploring images “foveally,” by moving a high-resolution area around. Written to gather visual attention data on Amazon Mechanical Turk.

  • Multi-color Image Search

    We present an open-source system for quickly searching large image collections by multiple colors given as a palette, or by color similarity to a query image.

  • CabFriendly

    A cloud-based mobile web application to match up users who request similar trips and would like to share a cab. The application is hosted on EC2 and combines several open-source frameworks with social networking and location-awareness APIs.

  • Virtual Zoom - [video] [pdf]

    With our application, the user can zoom in on a distant landmark using other people’s photographs. This work builds on a 3D scene modeling back end that infers the viewpoint of each photograph in an unordered collection (Photo Tourism).


Other stuff