Infinitives After Verbs of Perception: See, Hear, Watch

This article explains how perception verbs like see, hear, watch, notice, and feel work with the bare infinitive, the main sentence patterns, and when to use the -ing form instead. It shows meaning changes, common mistakes, and includes practice exercises.

When describing what you perceive with your senses, English often uses see, hear, or watch plus an object and a bare infinitive to show the action you noticed, as in “I saw someone cross the street” or “I heard a neighbor start to sing.” With practice, you can choose between the bare infinitive and the -ing form to reflect whether you noticed a complete action or only part of it.

How verbs of perception work with the bare infinitive

With verbs like see, hear, watch, notice, and feel, English often uses an object + base verb (no to) to show that you perceived a complete action or the whole event from start to finish.

Core pattern

  • Structure: verb of perception + object + bare infinitive
  • Meaning: the observer experiences the action as a complete unit (the action is viewed as “whole,” not just in progress).
  • Form note: the bare infinitive is the base form (go, come, leave, open), not the -ing form and not “to + verb.”

Typical examples (object + base verb)

  • I saw him cross the street.
  • We heard the door slam.
  • She watched the kids run into the yard.
  • They noticed the lights flicker.
  • I felt the building shake.
  • He saw her pick up the phone.
  • We heard someone knock at the window.
  • She watched him open the box.
  • They saw the cat jump onto the table.
  • I heard the baby cry and then stop.
  • He noticed her eyes widen.
  • We watched the plane land.
  • She heard him say her name.
  • I saw the waiter spill the water.
  • They felt the car swerve.

Bare infinitive vs. -ing: what changes

  • Bare infinitive focuses on the action as complete: “I saw him fall.” (you witnessed the whole fall)
  • -ing focuses on the action in progress: “I saw him falling.” (you saw part of it happening)
  • In real use, both can be possible; the choice depends on whether the speaker highlights the whole event or the ongoing action.

Common learner issues (and fixes)

  • ✅ I saw her leave. ❌ I saw her to leave.
  • ✅ We heard them argue. ❌ We heard them to argue.
  • ✅ She watched him unlock the door. ❌ She watched him to unlock the door.
  • Use the base verb after the object; do not add to in active sentences with these perception verbs.

Passive forms: “to” usually returns

  • Active: They saw him enter the building.
  • Passive: He was seen to enter the building.
  • Active: We heard her sing.
  • Passive: She was heard to sing.
  • This is a standard pattern: in the passive, English typically uses to + infinitive after the past participle (was seen/heard to do).

Sentence patterns with see, hear, watch, notice, and feel

These perception verbs commonly take either a bare infinitive (base verb) or an -ing form to describe what you observed. The choice usually depends on whether you mean a complete event (bare infinitive) or an action in progress (verb + -ing).

Pattern | Meaning focus | Example | Notes
Verb + object + bare infinitive | Whole action / complete event | I saw him cross the street. | Common with finished actions and clear endpoints.
Verb + object + -ing | Action in progress / partial observation | I saw him crossing the street. | Often suggests you noticed the action mid-way.
Verb + object + past participle | Result/state observed | I noticed the window broken. | Describes a condition, not the action of breaking.
Verb + wh- clause | Information noticed (what/where/how, etc.) | Did you hear what she said? | Use when the “object” is an idea or detail, not an action.
Verb + that clause | Fact perceived/realized | I noticed that he was nervous. | More common with notice/feel than with watch.

Bare infinitive vs. -ing: choosing the right meaning

  • Bare infinitive when you mean the event as a complete unit: “We watched the plane take off.”
  • -ing when you focus on the ongoing activity: “We watched the plane taking off.” (you observed part of the process)
  • With short, clearly bounded actions, the base verb often sounds more natural: “I heard the door slam.”
  • With longer activities, the -ing form is common: “I heard people talking in the hallway.”
  • Some contexts allow both with little difference; the surrounding time words can guide the choice: “for a while” often fits -ing, while “suddenly” often fits the bare infinitive.

Common patterns and example sentences

  • see + object + bare infinitive: I saw her open the envelope.
  • see + object + -ing: I saw her opening the envelope when I walked in.
  • hear + object + bare infinitive: We heard someone knock at the door.
  • hear + object + -ing: We heard someone knocking while we were eating.
  • watch + object + bare infinitive: They watched the children run to the bus.
  • watch + object + -ing: They watched the children running around the yard.
  • notice + object + bare infinitive: I noticed him pick up my keys.
  • notice + object + -ing: I noticed him picking up my keys as I turned around.
  • feel + object + bare infinitive: I felt the ground shake.
  • feel + object + -ing: I felt the ground shaking under my feet.
  • notice + object + past participle: She noticed the door locked.
  • see + object + past participle: I saw the car parked outside.
  • hear + wh-clause: I didn’t hear what he asked.
  • notice + that-clause: He noticed that the room was unusually quiet.
  • feel + that-clause: I felt that something was wrong.

Form notes and common mistakes

  • Use the base verb (not “to”) after these verbs in active voice: ✅ “I saw him leave.” ❌ “I saw him to leave.”
  • In passive voice, “to” is used: “He was seen to leave the building.”
  • If you do not know who performed the action, use an indefinite object such as someone: ✅ “I heard someone singing.” (object = someone). “I heard singing” is possible, but it is less specific because it names only the sound, not the singer.
  • Past participle describes a state you perceived, not the action: “I noticed the computer turned off.” (state) vs. “I noticed him turn off the computer.” (action)

Difference between bare infinitive and -ing form after perception verbs

After verbs like see, hear, and watch, English commonly uses either the bare infinitive (base verb) or the -ing form. The choice signals how you view the action: as a complete event (from start to finish) or as something in progress (a slice of the action).

Form after the object | Typical meaning | Common pattern | Example
Bare infinitive (base verb) | You perceived the whole action as a complete event (often start to finish), or as a single, bounded occurrence. | see/hear/watch + object + base verb | I saw him cross the street.
-ing form | You perceived the action in progress (an ongoing process), without focusing on its endpoint. | see/hear/watch + object + verb-ing | I saw him crossing the street.
Bare infinitive (with repeated/typical events) | Often suggests repeated actions or a sequence you observed as a series of complete units. | see/hear/watch + object + base verb | We watched the players shake hands after the match.
-ing form (with background activity) | Often sets a scene or background activity you noticed while something else happened. | see/hear/watch + object + verb-ing | I heard people talking in the hallway.

How the meaning changes in real sentences

  • Completed event: “I heard her sing the national anthem.” (you heard the performance as a whole)
  • In progress: “I heard her singing in the shower.” (you caught part of the action)
  • Single, bounded action: “He saw the glass fall off the table.” (the fall is a complete event)
  • Ongoing process: “He saw the glass falling and tried to catch it.” (focus on the action as it was happening)
  • Sequence as complete units: “We watched the kids run to the bus, climb in, and wave goodbye.”
  • Scene-setting: “We watched the kids running around while we waited.”

Quick usage notes and common patterns

  • Both forms are most natural when the perception verb is used for direct sensory experience (not opinion). Compare:
    • ✅ “I saw him leave.” (direct observation)
    • ✅ “I saw him leaving.” (direct observation, in progress)
    • ❌ “I saw that he leave.” (wrong structure for this meaning)
  • The object is usually required: “I saw him leave / leaving.” Without an object, you typically need a different structure (e.g., “I saw that he left”).
  • -ing is especially common when you notice something already happening: “I walked in and saw them arguing.”
  • The bare infinitive is especially common for short, clearly bounded actions: “I heard the door slam.”
  • With watch, the -ing form is very frequent because watching often implies observing an activity over time: “She watched him working.”

Expanded example set (choose the form that matches the viewpoint)

  • I saw the cat jump onto the counter. / I saw the cat jumping onto the counter.
  • They heard the baby cry. / They heard the baby crying.
  • We watched the plane land. / We watched the plane landing.
  • She saw him open the envelope. / She saw him opening the envelope.
  • I heard someone knock at the door. / I heard someone knocking at the door.
  • He watched the chef slice the fish. / He watched the chef slicing the fish.
  • They saw the lights go out. / They saw the lights going out.
  • We heard the crowd cheer. / We heard the crowd cheering.
  • She saw him pick up the keys. / She saw him picking up the keys.
  • I heard the dog bark twice. / I heard the dog barking outside.
  • He watched the children build a sandcastle. / He watched the children building a sandcastle.
  • They saw the cyclist fall. / They saw the cyclist falling and ran over.

How meaning changes between 'see someone do' and 'see someone doing'

The choice between the bare infinitive and the -ing form after see changes what you highlight: a complete action (viewed as a whole) or an action in progress (viewed as a slice of time). Both patterns are common, but they are used for different kinds of “visual evidence” in a sentence.

Pattern | Typical meaning | What the speaker focuses on | Example
see + object + bare infinitive (do) | The action is seen as complete (or as a whole event) | Beginning-to-end, the full event, the result | I saw him cross the street.
see + object + -ing (doing) | The action is seen in progress (not necessarily finished) | A moment in the middle, the ongoing activity, the scene | I saw him crossing the street.
see + object + bare infinitive (do) | Often sounds more “report-like” for single, countable events | A specific action that happened once | We saw the goalkeeper save the penalty.
see + object + -ing (doing) | Often sounds more descriptive for scenes and background activity | What was happening at that time | We saw the goalkeeper warming up near the goal.

Choosing the bare infinitive: “the whole action”

Use see + object + do when you want to present the action as a complete event, even if it only lasted a few seconds. This is common when the action has a clear endpoint (finish, leave, fall, open, close) or when you are reporting what you witnessed from start to finish.

  • I saw her pick up the wallet and put it in her bag.
  • They saw the plane land.
  • We saw the lights go out.
  • He saw the child trip and fall.
  • Did you see him enter the building?
  • I saw the cat jump onto the counter.
  • She saw the door open and someone step inside.
  • They saw the runner cross the finish line.
  • I saw him take the keys from the table.
  • We saw the bus pull away.

Choosing -ing: “part of the action”

Use see + object + doing when you want to describe what was happening at a particular moment. The focus is on the ongoing activity, not on whether it finished. This is especially natural for longer actions and for setting a scene.

  • I saw her talking to the manager.
  • They saw people waiting outside the station.
  • We saw him running down the stairs.
  • She saw the dog digging in the garden.
  • Did you see anyone using your computer?
  • I saw the kids playing in the street.
  • He saw smoke coming from the kitchen.
  • We saw her looking through the window.
  • They saw the waiter carrying a tray of drinks.
  • I saw him standing near the exit.

Same situation, different meaning

In many contexts, both forms are possible but they lead the listener to imagine a different viewpoint. The bare infinitive suggests you witnessed the event as a complete unit; the -ing form suggests you noticed it while it was happening.

  • I saw him leave the office. → You witnessed the departure (the moment he went out).
  • I saw him leaving the office. → You noticed the process (he was on his way out).
  • She saw the thief grab her phone. → A single completed action.
  • She saw the thief grabbing her phone. → The action in progress; emphasis on the scene.
  • We saw the chef taste the sauce. → One complete tasting action.
  • We saw the chef tasting the sauce. → Ongoing activity in the kitchen at that moment.

Common learner issues

  • Use the bare infinitive after see for the whole event: ✅ I saw him fall. ❌ I saw him to fall.
  • Prefer -ing when you did not see the action finish: ✅ I saw her walking toward the station (but you may not know if she arrived).
  • When describing a “snapshot” scene, -ing usually sounds more natural than a bare infinitive: ✅ I saw them arguing in the hallway.
  • When reporting a short, countable event (especially in narratives), the bare infinitive is often the default: ✅ I saw him drop the glass.

Common contexts where perception verbs appear in real conversations

Verbs like see, hear, and watch often introduce a second action: someone does something, and you perceive it. In everyday speech, this usually shows up in two main patterns: verb + object + base verb (complete action) and verb + object + -ing (action in progress).

1) At home: noticing everyday actions

  • see + object + base verb: “I saw him lock the door.” (the whole action is viewed as complete)
  • see + object + -ing: “I saw him locking the door.” (focus on the action as it was happening)
  • “I heard the baby cry.” / “I heard the baby crying.”
  • “We watched the kids build a fort.” / “We watched the kids building a fort.”
  • “Did you hear someone knock?” (single, completed sound event)
  • “I heard someone knocking for a while.” (repeated/continuous)

2) On the street: quick observations and reports

  • “I saw a cyclist run the red light.” (one complete event)
  • “I saw a cyclist running the red light.” (you noticed it in progress)
  • “We heard a car backfire.” (short, finished sound)
  • “I heard people shouting outside.” (ongoing noise)
  • “Did you see that dog jump the fence?” (whole action)
  • “I watched the bus pulling away.” (moment in progress)

3) Work and school: monitoring, checking, and supervision

  • “I watched her present the results.” (the presentation as a complete event)
  • “I watched her presenting and taking questions.” (focus on the process)
  • “The teacher saw him copy the answer.” (clear, completed act)
  • “The teacher saw him copying during the test.” (ongoing behavior)
  • “I heard the manager say we’re changing the schedule.”
  • “We heard them discussing the budget in the next room.”

4) Entertainment and media: performances, clips, and live events

  • “I watched the band play that song live.”
  • “I watched the band playing while everyone sang along.”
  • “Did you see her score?” (sports highlight, completed moment)
  • “I saw her scoring again and again.” (repeated action)
  • “We heard the actor forget a line.”
  • “I heard the crowd cheering from outside.”

5) Conversations about evidence: what you personally witnessed

  • Use these structures to make your report sound direct: “I saw him take it,” “I heard her admit it.”
  • For background or partial observation, the -ing form is common: “I saw him taking photos,” “I heard her talking about it.”
  • Negatives are usually formed on the perception verb, not on the perceived action: “I didn’t see him leave.” / “I didn’t hear them arguing.”
  • Questions keep the perceived action in the infinitive/-ing form: “Did you see her open it?” / “Did you hear him opening it?”

6) Common usage notes that affect choice

  • Base verb often suggests a complete event or a single moment: “I heard it click.”
  • -ing often suggests duration, repetition, or an action in progress: “I heard it clicking all night.”
  • With watch, the -ing form is especially natural when you observe a process: “We watched them working.”
  • With hear, the base verb is common for short sounds: “I heard the door slam.”
  • When the object is a pronoun, it stays before the second verb: “I saw him leave,” “I heard them laughing.”

Typical mistakes learners make with perception verb structures

Errors with see, hear, and watch often come from mixing up which verb form signals a complete action, an action in progress, or a passive meaning. The points below focus on the most common pattern problems and how to correct them.

  • Using to after a perception verb in active meaning

    In active structures, these verbs usually take a bare infinitive (no to) or an -ing form.

    ❌ I saw him to cross the street. → ✅ I saw him cross the street.

  • Choosing the wrong form: bare infinitive vs. -ing

    Use the bare infinitive for the whole action (from start to finish) or the action as a complete event; use -ing for something in progress or as an unfolding scene.

    • ✅ I watched her paint the fence. (the whole task / completed event)
    • ✅ I watched her painting the fence. (in progress / part of the action)
    • ❌ I heard him sing when I walked in (if you mean “already in progress”). → ✅ I heard him singing when I walked in.
  • Forgetting that passive structures usually require to

    In the passive, the bare infinitive typically changes to to + infinitive.

    • ✅ He was seen to leave the building.
    • ❌ He was seen leave the building.
    • ✅ They were heard to argue in the hallway.
  • Using a that-clause when an object + verb form is needed

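    A that-clause is grammatical after these verbs, but it usually signals that you realized or were told something, not that you witnessed it. For direct perception, use object + bare infinitive/-ing.

    • ❌ I saw that he fell (if you mean you watched it happen). → ✅ I saw him fall.
    • ❌ I heard that they were shouting (if you mean the sound reached your own ears). → ✅ I heard them shouting.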
  • Leaving out the object (or using the wrong object form)

    These patterns normally need an object between the perception verb and the following verb form.

    • ❌ I saw cross the street. → ✅ I saw him cross the street.
    • ❌ I heard she sing. → ✅ I heard her sing.
  • Confusing perception verbs with verbs of opinion or discovery

    See can mean “understand” or “find out,” and then the grammar changes (often a clause is natural). Learners sometimes apply the direct-perception pattern where it does not fit.

    • ✅ I see that you’re upset. (understand)
    • ✅ I saw you crying. (direct perception)
  • Using -ing when the meaning is a single, complete event

    If you mean one clear action (especially something quick), the bare infinitive is often the better choice.

    • ✅ I heard the door slam.
    • ✅ I saw the balloon burst.
    • ❌ I saw the balloon bursting (unless you mean a longer process, which is unusual here).
  • Using the bare infinitive when the action is clearly in progress

    If you walked in during the middle of an action, -ing usually matches the meaning better.

    • ✅ I saw him running down the street.
    • ✅ She heard the baby crying.
    • ❌ She heard the baby cry (if you mean “already crying,” not “started and finished crying”).
  • Adding an extra verb or repeating the idea unnecessarily

    Keep the structure tight: perception verb + object + verb form. A modal such as can or could before the perception verb is fine, but do not add to or a finite verb (was, is) between the object and the verb form.

    • ❌ I could see him to run. → ✅ I could see him running.
    • ❌ I saw him was running. → ✅ I saw him running.
  • Mixing up look at with watch

    Watch suggests observing over time; look at is simply directing your eyes. Learners sometimes choose the wrong verb and then force an unnatural structure.

    • ✅ I watched them play for an hour. (ongoing observation)
    • ✅ I looked at the painting. (brief visual attention)
  • Misplacing adverbs so the meaning becomes unclear

    Place adverbs where they clearly modify the perceived action, not the act of perceiving (unless that is what you mean).

    • ✅ I saw him quietly open the window. (he was quiet)
    • ✅ I clearly saw him open the window. (my perception was clear)
  • Using these patterns when the perception is not direct

    The object + bare infinitive/-ing structure fits best when you directly witness the action. If you infer it from evidence, a different structure is usually more natural.

    • ✅ I saw him leave. (I witnessed it)
    • ✅ I could see that he had left. (I inferred it from clues, e.g., his car was gone)

Practice exercises: choose between bare infinitive and -ing form

Choose the correct complement after verbs of perception (see, hear, watch, notice, feel, observe). Use the bare infinitive for a complete action (often viewed as a whole event) and the -ing form for an action in progress or a developing scene.

Exercise 1: Select the correct form

Complete each sentence with the correct option: verb + object + bare infinitive or verb + object + -ing.

  1. I saw the bus (stop / stopping) at the corner.
  2. We watched the kids (build / building) a sandcastle.
  3. She heard someone (call / calling) her name from the hallway.
  4. They noticed the lights (flicker / flickering) during the storm.
  5. He felt the ground (shake / shaking) under his feet.
  6. I heard the door (slam / slamming) upstairs.
  7. Did you see him (take / taking) the keys from the table?
  8. We watched the plane (disappear / disappearing) into the clouds.
  9. She saw the cat (jump / jumping) onto the counter.
  10. I noticed him (look / looking) at his phone during the meeting.
  11. They heard the baby (cry / crying) in the next room.
  12. We saw the athlete (cross / crossing) the finish line.
  13. He watched the chef (slice / slicing) the onions.
  14. I felt someone (tap / tapping) my shoulder.
  15. She heard the wind (howl / howling) all night.
  16. They noticed the waiter (spill / spilling) water on the floor.
Answers
  1. stop
  2. building
  3. call
  4. flickering
  5. shake
  6. slam
  7. take
  8. disappear
  9. jump
  10. looking
  11. crying
  12. cross
  13. slicing
  14. tap
  15. howling
  16. spilling

Exercise 2: Meaning check (whole event vs. in progress)

Choose the option that best matches the meaning in brackets.

  1. I saw her (enter / entering) the building. (I witnessed the complete action: outside → inside.)
  2. I saw her (enter / entering) the building. (I noticed her in the middle of the action, not necessarily the end.)
  3. We heard the singer (finish / finishing) the last note. (We heard the end point.)
  4. We heard the singer (finish / finishing) the last note. (We heard the performance as it was happening.)
  5. They watched the dog (run / running) across the road. (They saw the full crossing.)
  6. They watched the dog (run / running) across the road. (They observed it in progress.)
  7. I noticed him (put / putting) the letter in the envelope. (I saw the action completed.)
  8. I noticed him (put / putting) the letter in the envelope. (I caught a glimpse mid-action.)
Answers
  1. enter
  2. entering
  3. finish
  4. finishing
  5. run
  6. running
  7. put
  8. putting

Exercise 3: Correct the form

Each sentence has a form that is unnatural for the intended meaning in parentheses. Rewrite by changing only the complement (bare infinitive or -ing).

  1. I watched him to open the window. (Use the standard perception-verb pattern.)
  2. She heard the glass to break. (Use the standard perception-verb pattern.)
  3. We saw the thief running into the shop and then out again. (Focus on the complete sequence.)
  4. I noticed her write in her notebook while the teacher spoke. (Focus on the action in progress.)
  5. They felt the elevator moving and then stop suddenly. (Focus on the stop as a completed event.)
  6. He heard the judge speaking the final words of the sentence. (Focus on the end point.)
  7. She saw the child fall and lie still. (Focus on the fall as it happened, not only the end.)
  8. We watched the sun set behind the hills for an hour. (Focus on the process.)
Answers
  1. I watched him open the window.
  2. She heard the glass break.
  3. We saw the thief run into the shop and then out again.
  4. I noticed her writing in her notebook while the teacher spoke.
  5. They felt the elevator move and then stop suddenly.
  6. He heard the judge speak the final words of the sentence.
  7. She saw the child falling and lying still.
  8. We watched the sun setting behind the hills for an hour.

Quick reminders to apply while you answer

  • Use object + bare infinitive when the speaker presents the action as a whole: “I saw him leave.”
  • Use object + -ing when the speaker highlights an ongoing scene: “I saw him leaving.”
  • Do not use to + infinitive after see/hear/watch in this structure: ❌ “I saw him to leave.” ✅ “I saw him leave.”
  • With short, sudden events (slam, drop, break), the bare infinitive is often the natural choice.
  • With longer activities (walk, talk, work), the -ing form often fits when you mean “in progress.”
About the author

Ievgen Iesipovych is the creator of LingoHarvest, a project focused on simple and practical language learning. He writes clear English-learning guides with real-life examples, step-by-step explanations, and exercises designed for self-study learners.
