Design Wall for Meeting Rooms
(Multimedia Computing Project [V0.97])

[Please read Notes below.]

Important dates

The project and report requirements are presented next. There are some open options, and additional or alternative features are encouraged - for example, adding 3D support for the display or a precise face tracker.

Goals

A design wall for meeting rooms with a large shared screen is a collaborative workspace for brainstorming, ideation, and creative collaboration among team members. The screen is a focal point for collaborative design activities, e.g., for product design. Participants can collaborate in real time, and the design is displayed on the shared screen. This enables iterative design processes, where team members can collectively refine and iterate on design concepts, while also providing feedback and suggestions to one another.

The project goal is to build a system to control the interaction with the content on the screen. It will be able to display and control images and video in order to show design alternatives, previous approaches, or new ideas. Face detection technology can identify participants as they approach the screen, allowing for personalized greetings or user-specific interactions. It can also be used to analyze audience reactions and engagement levels during presentations, allowing presenters to adjust content accordingly. Object recognition can be used to detect and analyze physical objects or documents placed in front of the screen, triggering relevant digital content or actions. Simple gesture recognition technology is used for interaction with the screen without physical input devices.


Interaction with the Design Wall

Interaction will be done through a video camera and touch (which could be simulated using a mouse).

It should be possible to detect, at a minimum, motion, simple hand gestures, and whether there are people watching.

In the display, it should be possible to visualize images and videos in a grid. Playing a video or displaying an image in high resolution should also be possible.

Some ideas for display and interaction can be seen in the following videos and images:

https://www.clevelandart.org/artlens-gallery/artlens-wall

https://www.youtube.com/watch?v=yUNGUNHPqCM

http://tabler.tv/video-wall.html


Metadata

Each image and video clip should have the following metadata. This metadata is used to compare images. Metadata extraction could be implemented as a standalone program or as part of the display and interaction system.

This metadata can be stored in an XML file, so that it is only necessary to process each image or video once. Additionally, the list of media items for a user or project could also be stored as an XML file.

Development and Library Support

In order to support access to media content, openFrameworks is the suggested framework. The videoPlayer and dirListExample are good starting points for the project. The opencvHaarFinderExample and the cameraLensOffsetExample show how to integrate OpenCV with openFrameworks and also use face detection. The additional ofxCv addon is also useful. See example below.

To integrate XML files (for example, to store and retrieve metadata), the ofxXmlSettings addon and its sample application could be used. To build the user interface, ofxGui can be used. These addons are included in the distribution, but others are available on the openFrameworks site (addons).
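
As a reference, the sketch below shows one possible way to write and read per-item metadata with ofxXmlSettings. The file name and the tag names (luminance, faces) are illustrative assumptions, not a required schema.

#include "ofxXmlSettings.h"

// Save the metadata of one media item to its own XML file.
void saveMetadata(const std::string &path, double luminance, int faces)
{
    ofxXmlSettings xml;
    xml.addTag("metadata");
    xml.pushTag("metadata");
    xml.setValue("luminance", luminance);   // example numeric descriptor
    xml.setValue("faces", faces);           // example counter
    xml.popTag();
    xml.saveFile(path);                     // e.g. "image01.xml", one file per item
}

// Load the same metadata back, with defaults if the file does not exist.
void loadMetadata(const std::string &path)
{
    ofxXmlSettings xml;
    if (xml.loadFile(path)) {
        double luminance = xml.getValue("metadata:luminance", 0.0);
        int faces = xml.getValue("metadata:faces", 0);
    }
}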


Using OpenCV (Image Processing)

// Example using the ofxCv addon
// https://github.com/kylemcdonald/ofxCv


#include "ofxCvHaarFinder.h"
#include "ofxCv.h"

using namespace cv;
using namespace ofxCv;

// Convert openFrameworks pixels to OpenCV types (the class name ofApp and
// the pixelData argument are placeholders for your own code).
void ofApp::processFrame(ofPixels &pixelData)
{
    ofxCvColorImage img;
    img.setFromPixels(pixelData);

    cv::Mat m = toCv(img.getPixels());  // wraps the image pixels as a cv::Mat
    cv::InputArray src(m);              // if an InputArray is needed
}
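
Building on the example above, face detection with ofxCvHaarFinder (as used in opencvHaarFinderExample) could be sketched as follows; the finder member and the detectFaces method are illustrative, and the cascade file is the one shipped with that example.

// assumed member in ofApp.h:
// ofxCvHaarFinder finder;

void ofApp::setup()
{
    finder.setup("haarcascade_frontalface_default.xml");   // cascade from the example's data folder
}

void ofApp::detectFaces(ofImage &frame)
{
    finder.findHaarObjects(frame);       // run the Haar cascade on the current frame
    int faces = finder.blobs.size();     // one blob per detected face
    // finder.blobs[i].boundingRect gives each face rectangle, e.g. for drawing
}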


Edge distribution

Edge distribution can be obtained by using one or more edge filters (see slides) and counting or averaging the results. An edge histogram is a good way to represent the edge distribution. Using OpenCV, edge filters can be applied with:
filter2D(src, dst, ddepth, kernel, anchor, delta, BORDER_DEFAULT);
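
A minimal sketch of this idea is shown below, assuming a grayscale input image and four illustrative directional kernels (the actual kernels should follow the ones presented in the slides).

#include <opencv2/opencv.hpp>
#include <vector>

// Build a 4-bin edge histogram: apply one directional kernel per bin with
// filter2D and accumulate the absolute responses.
std::vector<double> edgeHistogram(const cv::Mat &gray)
{
    cv::Mat kH  = (cv::Mat_<float>(3, 3) << -1, -1, -1,  0, 0,  0,  1,  1,  1);  // horizontal
    cv::Mat kV  = (cv::Mat_<float>(3, 3) << -1,  0,  1, -1, 0,  1, -1,  0,  1);  // vertical
    cv::Mat kD1 = (cv::Mat_<float>(3, 3) <<  0,  1,  1, -1, 0,  1, -1, -1,  0);  // 45 degrees
    cv::Mat kD2 = (cv::Mat_<float>(3, 3) <<  1,  1,  0,  1, 0, -1,  0, -1, -1);  // 135 degrees
    std::vector<cv::Mat> kernels = { kH, kV, kD1, kD2 };

    std::vector<double> hist;
    for (const auto &k : kernels) {
        cv::Mat response;
        cv::filter2D(gray, response, CV_32F, k);
        hist.push_back(cv::sum(cv::abs(response))[0]);   // total edge energy per direction
    }

    // Normalize so histograms of images with different sizes are comparable
    double total = hist[0] + hist[1] + hist[2] + hist[3] + 1e-9;
    for (double &v : hist) v /= total;
    return hist;
}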

Texture characteristics

Texture can be evaluated using Gabor filters. Currently, Gabor filters are supported by OpenCV: the kernels (filters) can be generated with getGaborKernel and applied with filter2D. There are options for the parameters depending on the desired result. Several orientations and frequencies/wavelengths could be combined, resulting in a bank with multiple filters (for example, 6 orientations and 4 frequencies).
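
As an illustration, the sketch below builds a bank with 6 orientations and 4 wavelengths; the kernel size, sigma, gamma and wavelength values are assumptions that should be tuned for the desired result.

#include <opencv2/opencv.hpp>
#include <vector>

// One texture feature per filter: the mean absolute response of the image
// to each Gabor kernel (6 orientations x 4 wavelengths = 24 features).
std::vector<double> gaborFeatures(const cv::Mat &gray)
{
    std::vector<double> features;
    const double wavelengths[4] = { 4, 8, 12, 16 };   // assumed wavelengths (pixels)

    for (int o = 0; o < 6; o++) {
        double theta = o * CV_PI / 6.0;               // 6 orientations
        for (double lambda : wavelengths) {
            cv::Mat kernel = cv::getGaborKernel(cv::Size(21, 21),   // kernel size
                                                4.0,                // sigma
                                                theta,              // orientation
                                                lambda,             // wavelength
                                                0.5,                // gamma (aspect ratio)
                                                0,                  // psi (phase offset)
                                                CV_32F);
            cv::Mat response;
            cv::filter2D(gray, response, CV_32F, kernel);
            features.push_back(cv::mean(cv::abs(response))[0]);
        }
    }
    return features;
}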

Number of times a specific object (input as an image) appears in the image or in the video

To compare images, keypoint-based methods usually give good results in image matching, and OpenCV includes some of these methods (features2d). There are several examples using this framework in OpenCV that include the different processing stages: 1) keypoint detection; 2) descriptor extraction; 3) descriptor matching. The matcher_simple example and its description are a good starting point to understand the process; this example shows the newer interface and the several detectors. There are different ways to detect keypoints and extract descriptors, including SIFT, SURF, ORB and FAST, which lead to different results and processing needs. A simple and effective approach is to base the code on the matcher_simple example but use ORB instead of SURF, as SURF is part of the extended/non-free features package (xfeatures2d).
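
A minimal sketch of this pipeline using ORB is shown below; the ratio-test threshold (0.75) and the idea of using the number of "good" matches as a rough indicator of the object's presence are illustrative choices.

#include <opencv2/opencv.hpp>
#include <vector>

// Returns the number of good keypoint matches between an object image and a
// scene image (a frame of the video or another image).
int countGoodMatches(const cv::Mat &objectImg, const cv::Mat &sceneImg)
{
    // 1) keypoint detection + 2) descriptor extraction
    cv::Ptr<cv::ORB> orb = cv::ORB::create();
    std::vector<cv::KeyPoint> kp1, kp2;
    cv::Mat desc1, desc2;
    orb->detectAndCompute(objectImg, cv::noArray(), kp1, desc1);
    orb->detectAndCompute(sceneImg,  cv::noArray(), kp2, desc2);
    if (desc1.empty() || desc2.empty()) return 0;

    // 3) descriptor matching (Hamming distance for binary ORB descriptors)
    cv::BFMatcher matcher(cv::NORM_HAMMING);
    std::vector<std::vector<cv::DMatch>> knnMatches;
    matcher.knnMatch(desc1, desc2, knnMatches, 2);

    // Keep only matches that pass the ratio test
    int good = 0;
    for (const auto &m : knnMatches) {
        if (m.size() == 2 && m[0].distance < 0.75f * m[1].distance) good++;
    }
    return good;   // a high count suggests the object appears in the scene
}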


Deliverable 1: Specification and Related Work Study 

This is the initial specification of the system and a study of related work. Enhancing multimedia information with metadata to improve browsing and search operations has been explored by other authors in different situations and contexts. Some of these systems use advanced image and video processing algorithms. The main goal of this study is to provide an overview of the state of the art in this area. The results should be summarized and discussed in a short paper (max 4 pages + images in an annex if needed; optional format here: https://www.acm.org/publications/proceedings-template). The paper should have the following structure:

Deliverable 2: Final Report + Code

The following structure is suggested for the final report (up to 6 pages), including part of the content from the initial specification and related work:

The report should also include as an appendix:


Notes [Updated 2/5/2024.]

2/5:

Face recognition is not mandatory to implement what is required, but some of the projects you submitted include work with face recognition - for example, when someone approaches the screen, recognition could be used to exit a screensaver and load the workspace for that person. There are many ways to do face recognition, and we can use what is provided by OpenCV. There is an addon, but it is also possible to use OpenCV directly.
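
A minimal sketch of one possible approach is shown below, assuming the opencv_contrib face module (cv::face) is available; the label-to-user mapping and the training data are illustrative.

#include <opencv2/face.hpp>
#include <opencv2/opencv.hpp>
#include <vector>

// Train an LBPH recognizer with grayscale face crops (all the same size),
// one integer label per user.
cv::Ptr<cv::face::LBPHFaceRecognizer> trainRecognizer(
        const std::vector<cv::Mat> &faceImages,
        const std::vector<int> &userLabels)
{
    cv::Ptr<cv::face::LBPHFaceRecognizer> model = cv::face::LBPHFaceRecognizer::create();
    model->train(faceImages, userLabels);
    return model;
}

// Predict which user a new face crop belongs to; the returned label can be
// mapped to that person's workspace.
int recognizeUser(const cv::Ptr<cv::face::LBPHFaceRecognizer> &model,
                  const cv::Mat &faceImage)
{
    int label = -1;
    double confidence = 0.0;
    model->predict(faceImage, label, confidence);   // lower confidence = closer match
    return label;
}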

Meanwhile, do not forget that the list below should have been implemented by now!


8/4: By 19/4 your project should include at least:

1 - A gallery with images and video (Examples: dirListExample, videoPlayerExample).

2 - The capability to play videos and show images full screen (Example: videoPlayerExample).

3 - The ability to show (and hide using a key) the camera capturing images (Example: videoGrabberExample).

4 - Face detection on the camera image (Example: opencvHaarFinderExample).

5 - Read and write XML - one XML file for each image or video file, with metadata (Example: xmlSettingsExample).

Needed later:

1 - At least one pixel processing algorithm (color or light) applied to the images and the result stored as metadata (Example: videoGrabberExample).

2 - Simple motion detection using the camera (Example: opencvExample).
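
For item 2, a minimal sketch of frame-difference motion detection with ofxOpenCv is shown below; the image members, the threshold value and the pixel-count limit are illustrative, and the images are assumed to have been allocated in setup() with the grabber size.

#include "ofxOpenCv.h"

// assumed members in ofApp.h:
// ofVideoGrabber grabber;
// ofxCvColorImage colorImg;
// ofxCvGrayscaleImage grayImg, previous, diff;

void ofApp::update()
{
    grabber.update();
    if (grabber.isFrameNew()) {
        colorImg.setFromPixels(grabber.getPixels());
        grayImg = colorImg;                   // convert to grayscale
        diff.absDiff(previous, grayImg);      // difference from the previous frame
        diff.threshold(30);                   // keep only strong changes
        int changed = diff.countNonZeroInRegion(0, 0, diff.getWidth(), diff.getHeight());
        bool motion = changed > 1000;         // illustrative limit
        previous = grayImg;                   // current frame becomes the reference
    }
}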


22/3: Introduction to openFrameworks and the project. Experimenting with the videoPlayer and the dirListExample. Both these examples are in the distribution in the /examples folder - in the video and input-output folders. The class structure should be planned from the beginning.

15/3: Introduction to openFrameworks and the project. Starting to experiment with the videoPlayer and the dirListExample. Both these examples are in the distribution in the /examples folder - in the video and input-output folders. Together, these two examples support some of the requirements in the project - namely displaying image and video files and listing a folder with this type of content. Both examples follow the standard OF model with setup(), update() and draw() methods. Event handlers are also included, for example to handle key presses or mouse moves. The goal is to extend the example by displaying a gallery of images and videos - and, in the process, use the tests to specify the interface. It should be possible to play, pause, and resume the videos.
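
A minimal sketch of the play/pause/resume part in the standard OF model is shown below; the member name, the movie path and the chosen keys are illustrative.

#include "ofMain.h"

// assumed member in ofApp.h:
// ofVideoPlayer player;

void ofApp::setup()
{
    player.load("movies/example.mp4");   // illustrative path inside bin/data
    player.play();
}

void ofApp::update()
{
    player.update();                     // advance the video frames
}

void ofApp::draw()
{
    player.draw(0, 0);
}

void ofApp::keyPressed(int key)
{
    if (key == 'p') player.setPaused(!player.isPaused());   // pause / resume
    if (key == 's') player.stop();                           // stop playback
}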

The goal for this class is to display images and video in the same window - like a gallery with rows and columns of image and video thumbnails.
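
A minimal sketch of the drawing part of such a gallery is shown below, assuming the thumbnails were already loaded into a vector of ofImage in setup(); the number of columns, cell size and margin are illustrative, and videos can be drawn in the same way with ofVideoPlayer::draw().

#include "ofMain.h"

// assumed member in ofApp.h:
// std::vector<ofImage> items;   // loaded from a folder, e.g. with ofDirectory

void ofApp::draw()
{
    const int cols = 4;                                  // thumbnails per row
    const float cellW = 200, cellH = 150, margin = 10;

    for (std::size_t i = 0; i < items.size(); i++) {
        int row = i / cols;
        int col = i % cols;
        float x = margin + col * (cellW + margin);
        float y = margin + row * (cellH + margin);
        items[i].draw(x, y, cellW, cellH);               // draw each thumbnail in its cell
    }
}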