CognitiveCrop™

Product and Solution Overview

CognitiveCrop™ is a Cognitive Mill product that offers a game-changing way to quickly adapt horizontal videos to vertical mode, fit the social media format, and keep the main objects in focus.

In our Automated Cropping to Portrait Mode Solution, we’ve achieved human-like focus based on motion perception. The system finds the most important parts of the video and keeps them within the frame.

In cropped videos, the cropping frame moves smoothly within each camera operator shot, which creates a better viewing experience.

Our cropping metadata describes the X coordinate of the cropping frame’s top-left corner and the frame’s width for each video frame within a shot.

The cropping frame is adapted to the 9:16 aspect ratio used in popular social media, such as TikTok and Instagram Stories. This width-to-height ratio is constant and equals 0.5625 (9 / 16 = 0.5625).

After running the parent Crop meta process, you get a meta.json file that contains all the required metadata for further processing with MediaMill™ or with third-party software.

Benefits:

  • You get a JSON file with all the required metadata describing how to crop your horizontal video to vertical mode without losing the core objects and actions.
  • CognitiveCrop™ automatically processes any type of content without additional configurations. You can crop sports events, movies, series, TV shows, or news.
  • The cropping frame moves smoothly within every operator shot, which creates a positive viewer experience.
  • You can interact with the platform either via the UI or via the API.
  • You can easily integrate the CognitiveCrop™ solution with third-party software or systems via the API.
  • The video cropping process takes several minutes, and then you can immediately distribute your cropped video on social media.

Let’s review what is included in the output meta.json file after running the Crop meta process.

Output Metadata

The meta.json file is the core file that contains the video metadata for further processing, either with third-party systems and software or with Cognitive Mill by running child processes, such as generating the cropped video with MediaMill™ for instant social media distribution.

The Crop meta.json file that you get after running the Crop meta process is essential for video cropping automation. The file describes video segments of the shot type and includes the following cropping metadata:

  • Type. The type of a segment, which is a shot for the Crop meta process. In CognitiveCrop™, we provide only shots because splitting into subshots is redundant in this case.
  • Start. The start time of the segment in the video timeline. Ms stands for milliseconds.
  • End. The end time of the segment in the video timeline. Ms stands for milliseconds.
  • Repr_ms. The time marker, in milliseconds, of the representative frame within the current shot, that is, the frame that describes the shot best.
  • Crop elements including:
    - Crop_resolution. The frame resolution with the constant value of 0.5625 (9:16 aspect ratio).
    - Ms_data. A set of time markers in milliseconds, each mapped to a crop_x value.
      Crop_x describes the X coordinate of the top-left corner of the cropping frame within the video frame at the given moment of time within a shot. The value is given as a fraction of the whole video frame’s width and ranges from 0 to 1.

This data is enough to calculate the coordinates of all four corners of the cropping frame.

For example:

When the resolution of the original video is 1024x768px, the coordinates of the cropping frame’s corners will be the following:

  • The top-left corner: [1024*crop_x, 0].
  • The bottom-left corner: [1024*crop_x, 768].
  • The top-right corner: [1024*crop_x + 768*crop_resolution, 0].
  • The bottom-right corner: [1024*crop_x + 768*crop_resolution, 768].
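
The same calculation can be expressed in a few lines of code. Below is a minimal Python sketch; the function name and the example crop_x value are ours and serve only as an illustration.

def crop_corners(frame_width, frame_height, crop_x, crop_resolution=0.5625):
    """Return the pixel coordinates of the cropping frame's corners."""
    left = frame_width * crop_x                    # left edge, in pixels
    right = left + frame_height * crop_resolution  # crop width = height * 9/16
    return {
        "top_left": (left, 0),
        "bottom_left": (left, frame_height),
        "top_right": (right, 0),
        "bottom_right": (right, frame_height),
    }

# The 1024x768 example above with an arbitrary crop_x value:
print(crop_corners(1024, 768, 0.35))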

Note that you cannot change the crop_resolution value yourself; this restriction prevents the cropping frame from moving outside the borders of the original video frame.

Contact us if you need a different cropping frame resolution.

Below is an example of a Crop meta.json file.

{
  "segments": [
    {
      "crop": {
        "crop_resolution": 0.5625,
        "ms_data": {
          "1000.0": {
            "crop_x": 0.35694444444444445
          },
          "1040.0": {
            "crop_x": 0.3541666666666667
          },
          "1080.0": {
            "crop_x": 0.35
          },
          "1120.0": {
            "crop_x": 0.34305555555555556
          },
          "1160.0": {
            "crop_x": 0.33611111111111114
          }
        }
      },
      "end": {
        "ms": 5000
      },
      "repr_ms": 4240,
      "start": {
        "ms": 0
      },
      "type": "shot"
    }
  ]
}
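
To show how this structure can be consumed programmatically, here is a minimal Python sketch that reads a meta.json file and prints the pixel crop window for every time marker of every shot. The 1920x1080 source resolution is an assumption made for the example; substitute your video’s actual resolution.

import json

# Assumed source resolution for this example; use your video's actual size.
FRAME_WIDTH, FRAME_HEIGHT = 1920, 1080

with open("meta.json") as f:
    meta = json.load(f)

for segment in meta["segments"]:
    if segment["type"] != "shot":
        continue
    crop = segment["crop"]
    crop_width = FRAME_HEIGHT * crop["crop_resolution"]  # 9:16 crop keeps full height
    print(f"Shot {segment['start']['ms']}-{segment['end']['ms']} ms")
    for ms, values in sorted(crop["ms_data"].items(), key=lambda kv: float(kv[0])):
        left = FRAME_WIDTH * values["crop_x"]  # left edge of the crop, in pixels
        print(f"  {float(ms):9.1f} ms  x = {left:7.1f} px  width = {crop_width:.1f} px")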

Demo Case

This guide shows you how you can crop your horizontal video to fit the vertical aspect ratio of social networks via the UI at run.cognitivemill.com.

Before cropping your video:

1. Sign in to your account or register if it’s your first visit.

2. Make sure you have the Crop meta quota to get your video processed.

New users are provided with trial quotas. If you don’t have the required quotas, contact us at support@aihunters.com to get them.

Now you're all set.

1. Click Run Process on the top navigation bar.

The Run a New Process page opens.

2. Select Crop meta from the drop-down list of the Process type field.
Note that you can select only those processes for which you have quotas.

3. In the Title field, enter a name for your cropping process.

4. In the Video source field, either paste a link to the video you want to process — the default Use video link option — or click the field > select Use video file > click Select file > pick a file from your device to upload.

5. (Optionally) Clear the checkbox to cancel the creation of a transcoded proxy file. The checkbox is selected by default.
A transcoded proxy file is lightweight, so it is easier and faster for the visualizer to open it.

6. Click the Run Process button.
The video processing starts. You can follow the progress on the Process List page, where your process appears with its current status.

When the status changes to Completed, you can:

  • Download metadata for further processing with third-party systems and software.
  • Preview the cropping frame in the Cognitive Mill visualizer and generate the cropped video with MediaMill™.

To download metadata:

1. On the Process List page, click the three vertical dots next to the process.

2. In the pop-up menu that appears, click Get meta.json.

The meta.json file is downloaded to your device.
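
As one example of further processing with third-party software, the sketch below feeds the downloaded meta.json to ffmpeg to cut each shot from the original video and crop it to the vertical frame. It applies a single crop position per shot, taken near the representative frame, which is a simplification for illustration and not how MediaMill™ renders the final video; the file names and the source resolution are assumptions.

import json
import subprocess

SRC = "input.mp4"                        # original horizontal video (assumed name)
FRAME_WIDTH, FRAME_HEIGHT = 1920, 1080   # assumed source resolution

with open("meta.json") as f:
    meta = json.load(f)

for i, segment in enumerate(meta["segments"]):
    crop = segment["crop"]
    out_w = round(FRAME_HEIGHT * crop["crop_resolution"])  # 9:16 crop width in pixels
    out_w -= out_w % 2                   # keep the width even for common encoders
    # Approximation: use the crop_x closest to the shot's representative frame.
    ms, values = min(crop["ms_data"].items(),
                     key=lambda kv: abs(float(kv[0]) - segment["repr_ms"]))
    x = round(FRAME_WIDTH * values["crop_x"])
    subprocess.run([
        "ffmpeg", "-y",
        "-i", SRC,
        "-ss", str(segment["start"]["ms"] / 1000),
        "-to", str(segment["end"]["ms"] / 1000),
        "-vf", f"crop={out_w}:{FRAME_HEIGHT}:{x}:0",
        f"shot_{i:03d}_vertical.mp4",
    ], check=True)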

To preview what falls within the cropping frame:

1. On the Process List page, click the title of the process to open the Cognitive Mill visualizer.

2. Click the Play button under the video.

To crop the video:

1. On the Process List page, click the title of the process to open the Cognitive Mill visualizer.

2. Click Add to Editor above the shot timeline under the video to add the segments to Editor.

3. Click Run Media Mill.
The Media Mill page opens.

4. Enter a title for your process.

5. (Optionally) Select the checkbox to create a lightweight transcoded proxy file, which is easier and faster for the visualizer to open.

6. Click Run Process.
The process appears at the bottom of the page in its current status.

When the status of the process changes to Completed, you can download the cropped video to your device.

To download the cropped video:
1. Click the three vertical dots icon next to the completed process.

2. In the pop-up menu that opens, click Get out media. The video opens in a separate tab for preview.

3. Click the three vertical dots icon in the bottom-right corner of the video.

4. In the pop-up menu that opens, click Download.

The video is downloaded to your device.

Current Challenges

  • Abrupt movements.
    The system may ignore extremely fast and abrupt movements because it cannot always catch them. Although such movements are usually not emphasized or essential for the main action, we are already working on a solution.
  • Zoomed objects.
    The system may focus on big zoomed static objects and ignore smaller moving objects, while a human eye would catch the moving object first. This happens because the system is misled by the camera operator’s focus on the zoomed object. We keep improving the robot’s vision so that it fully imitates human focus.
  • Static characters in the dynamic environment.
    When the environment is more dynamic than the character, the robot’s eyes may focus on the scenery. Heavy rain or snow, or something eye-catching, like flickering lights, may confuse the robot’s eyes and make the system keep these dynamic parts within the frame while leaving the main character outside it.
  • Two or more dynamic characters moving simultaneously.
    This case is especially challenging because of the social media format limitations. When two main characters are moving at the same time, we cannot show both of them because the frame’s size cannot change. We also cannot switch from one object to the other too fast without breaking the smooth and convenient viewing experience. So we have to choose one object and focus on it.

We keep working on the most efficient ways to overcome the current challenges so that the following versions of CognitiveCrop™ no longer face them.

